When trying to import a url that ends with a slash
os.path.basename() would return an empty string. So
when seeing a leading slash, remove it and infer the
patch name from the remaining url.
e.g. `hg qimport http://paste.pocoo.org/raw/xxx/`
deduced '.' to be the patch name. Now we'll deduce 'xxx'.
gitpatch objects emitted by iterhunks() were referencing file paths unmodified
from the input patch. _applydif() made them usable by modifying the gitpatch
objects in-place with specified path strip level. The same modified objects
were then reused by iterhunks() generator. _applydiff() now copies and update
the paths which completely decouples both routines.
As a side effect, the "git" event now receives only metadata about
copies/renames to perform the necessary copies ahead of time. Other actions are
handled in the "file" event.
Patch changes are emitted by iterhunks() in two separate events: 'file' when
hunks have to be applied and 'git' to describe other modifications like copies
or mode changes. Note that a file which mode is changed and which content is
modified by the same patch will be emitted in both events. It is more
convenient to handle all file modifications in a single event. This patch
"zips" git actions with regular changes so both kinds can be emitted at the
same place.
git afile/bfile are extracted twice, once when reading a 'diff --git', and
again when reading a unified hunk. The problem is not all git blocks have
unified hunks (renames just have metadata) and they were not extracted the same
way. This is what this patch unifies.
gitpatch objects emitted by iterhunks() are modified in place by applydiff().
Processing them earlier improves iterhunks() isolation. applydiff() modifying
them should still be fixed though.
Failing to do so makes it impossible to use the memctx API to create a
changeset with a commit message or username outside of the current
encoding.encoding setting.
_updatedir() is no longer used by internalpatch()
The change in test-mq-missingfiles.t comes from workingbackend not considering
the missing 'b' file as changed, thus not calling addremove() on it.
This introduces a performance regression for large files, as they will be
copied just to be clobbered afterwards since binary patching does not use
deltas. But it simplifies the code and the previous optimization will be
reintroduced later in a better way.
_applydiff() patcher argument was added to help hgsubversion like extension
monkeypatching the patching process. While it could be removed at this point, I
prefer to leave it until patch.py is completely refactored and there is a valid
and tested alternative.
This greatly improves the speed of the bundling process, and often reduces the
bundle size considerably. (Although if the repository is already ordered, this
has little effect on both time and bundle size.)
For non-generaldelta clients, the reduced bundle size translates to a reduced
repository size, similar to shrinking the revlogs (which uses the exact same
algorithm). For generaldelta clients the difference is minor.
When the new bundle format comes, reordering will not be necessary since we
can then store the deltaparent relationsships directly. The eventual default
behavior for clients and servers is presented in the table below, where "new"
implies support for GD as well as the new bundle format:
old client new client
old server old bundle, no reorder old bundle, no reorder
new server, non-GD old bundle, no reorder[1] old bundle, no reorder[2]
new server, GD old bundle, reorder[3] new bundle, no reorder[4]
[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra
work on the server
[4] reordering isn't needed because client can get deltabase in bundle
format
Currently, the default is to reorder on GD-servers, and not otherwise. A new
setting, bundle.reorder, has been added to override the default reordering
behavior. It can be set to either 'auto' (the default), or any true or false
value as a standard boolean setting, to either force the reordering on or off
regardless of generaldelta.
Some timing data from a relatively branch test repository follows. All
bundling is done with --all --type none options.
Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M
Without reorder (default):
Bundle time: 14.4 seconds
Bundle size: 939M
With reorder:
Bundle time: 1 minute, 29.3 seconds
Bundle size: 381M
Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M
Without reorder:
Bundle time: 2 minutes, 1.4 seconds
Bundle size: 939M
With reorder (default):
Bundle time: 25.5 seconds
Bundle size: 381M
Most filesystem calls are already isolated in patchfile but this is not enough:
renames are performed before patchfile is available and some chmod calls are
even done outside of the applydiff call. Once all these calls are extracted
into a backend class, we can provide cleaner APIs to write to a working
directory context directly into the repository.
Previously the connection cache for keepalives didn't keep track of
ssl. This meant that when we connected to an https server after that
same server via http, both on the default port, we'd incorrectly reuse
the non-https connection as the default port meant the connection
cache key was the same.
When in plain mode with "alias" present in the exception list,
keep the aliases. This will be used later to enable auto-completion.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Let ui.plain() accept an optional parameter in the form of a feature
name (as a string) to exclude from plain mode.
The result of ui.plain is now:
- False if HGPLAIN is not set or the requested feature is in HGPLAINEXCEPT
- True otherwise
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>