Commit Graph

30363 Commits

Author SHA1 Message Date
David Soria Parra
4683ce4a8a convert: allow passing in a revmap
Implement `common.setrevmap` which is used to pass in a file with existing
revision mappings. This functionality is used by `convertcmd.convert` if it
exists and allows implementors such as the p4 converter to make use of an
existing mapping.

We are using the revmap to abort scanning the repository for more information
if we already have the revision. This means we are allowing incremental imports
in cases where a revmap is provided.
2016-12-14 01:45:57 -08:00
David Soria Parra
cde8421946 convert: use convert_revision for P4 imports
We are using convert_revisions in other importers. In order to unify this
we are also using convert_revision for Perforce in addition to the original
'p4'.
2016-12-13 21:49:58 -08:00
David Soria Parra
eeb4ddccf3 convert: remove unused dictionaries
self.parent, self.lastbranch and self.tags have never been used.
2016-12-14 01:45:17 -08:00
David Soria Parra
5b7b900ca3 convert: self.heads is a list
self.heads is used as a list throughout convert and never a dictionary.
Initialize it correctly to a list.
2016-12-14 01:43:47 -08:00
David Soria Parra
338550df2b convert: don't use long list comprehensions
We are iterating over p4changes. Make the continue condition clearer and
make it easier to add new conditions in future patches by removing the list
comprehension and moving the condition into the existing for-loop.
2016-12-13 21:49:58 -08:00
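
The refactoring described above can be sketched as follows. This is a minimal illustration, not the converter's actual code: the function name `select_changes` and the `startrev` parameter are assumptions made for the example.

```python
def select_changes(p4changes, startrev=None):
    """Hypothetical helper: keep changes at or after startrev.

    The filter lives inside the loop as an explicit `continue`, so further
    skip conditions can be added later as extra `if ...: continue` lines.
    """
    selected = []
    for change in p4changes:
        # Skip changes older than the requested starting revision.
        if startrev is not None and int(change) < int(startrev):
            continue
        selected.append(change)
    return selected


changes = select_changes(["1", "5", "10"], startrev="5")
```

The equivalent list comprehension would need every new condition folded into one expression, which is what the commit message argues against.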
Durham Goode
323f27948d changelog: keep track of file end in appender (issue5444)
Previously, changelog.appender.end() would compute the end of the file by
joining all the current appended data and checking the length. This is an O(n)
operation.  449b4adb7d39 introduced a seek call before every revlog write, which
means we are hitting this O(n) behavior n times, which causes changelog writes
during a pull to be n^2.

In our large repo, this caused pulling 100k commits to go from 17s to 130s. With
this fix, it's back to 17s.
2016-12-15 11:00:18 -08:00
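
A rough sketch of the idea, assuming a simplified in-memory appender: the class `TrackingAppender` and its method names are hypothetical and only mirror the behavior described in the message (keep a running end offset instead of joining all pending data on every call).

```python
class TrackingAppender:
    """Hypothetical appender that buffers writes past the end of a file."""

    def __init__(self, initial_size):
        self._chunks = []          # data appended but not yet flushed
        self._end = initial_size   # running end-of-file offset

    def write(self, data):
        self._chunks.append(bytes(data))
        self._end += len(data)     # O(1) bookkeeping per write

    def end(self):
        # O(1): no need to join the buffered chunks.
        return self._end

    def end_slow(self, initial_size):
        # The previous approach: O(n) in the amount of buffered data.
        return initial_size + len(b"".join(self._chunks))


appender = TrackingAppender(10)
appender.write(b"abc")
appender.write(b"de")
```

Calling `end()` before each of n writes then costs O(n) in total rather than O(n^2).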
Augie Fackler
74b72bf255 tests: fix test-bdiff to handle variance between pure and c bdiff code
Obviously we'd rather patch pure to have the same algorithmic win as
the C code, but this is a quick fix for the pure build since pure
isn't wrong, just not as fast as it could be.
2016-12-15 11:14:00 -05:00
Augie Fackler
7615fa0f90 tests: finish updating test-bdiff to unittest (part 4 of 4) 2016-12-15 11:04:09 -05:00
Augie Fackler
973a0b2065 tests: update more of test-bdiff.py to use unittest (part 3 of 4) 2016-12-15 10:56:26 -05:00
Augie Fackler
d751e0686d tests: update more of test-bdiff.py to use unittest (part 2 of 4) 2016-12-15 10:50:06 -05:00
Augie Fackler
133f8468d3 tests: migrate test-bdiff.py to use unittest (part 1 of 4)
This moves all the test() calls, which were easy and mechanical.
2016-12-15 10:10:15 -05:00
Pierre-Yves David
54022676d2 import-checker: do not enforce lexical sort across stdlib/local boundary
Before this change, you could get into a state where the checker would either
complain about importing a local module before a stdlib one or complain about the
local one being wrongly lexically sorted with the stdlib one.

We detect the boundary and avoid complaining about lexical sort across it.
2016-12-15 19:56:48 +01:00
Stanislau Hlebik
da605718f2 cg1packer: fix compressed method
`cg1packer.compressed()` returns True even if `self._type` is 'UN'. This patch
fixes it.
2016-12-14 09:53:56 -08:00
Philippe Pepiot
396c998f12 perf: add historical support of ui.load()
ui.load() has been available since d83ca854 and at the time of writing isn't
available on the stable branch, which breaks benchmarking of newer stable
revisions.

Add historical portability policy note on contrib/benchmarks
2016-12-15 12:17:08 +01:00
Jun Wu
44c0d5d616 chg: ignore HG_* in confighash
The environment variables `HG_*` are usually used by hooks. Unlike `HGPLAIN`
etc, they do not actually affect hg's behavior. So do not include them in
confighash.

This would avoid spawning an unbound number of chg server processes if
commit hook calls hg frequently.
2016-12-14 02:17:59 +00:00
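
The effect can be illustrated with a small sketch. The helper names `env_for_hash` and `confighash` and the hashing scheme below are assumptions for illustration; the real chg server decides which variables to hash in its own way.

```python
import hashlib


def env_for_hash(environ):
    """Hypothetical filter: drop hook-only HG_* variables before hashing."""
    return sorted((k, v) for k, v in environ.items() if not k.startswith("HG_"))


def confighash(environ):
    """Hash only the variables that are assumed to affect hg's behavior."""
    payload = repr(env_for_hash(environ)).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:12]


base = {"HGPLAIN": "1", "PATH": "/usr/bin"}
in_hook = dict(base, HG_NODE="abc123")   # as seen inside a commit hook
```

With this filtering, `confighash(base)` and `confighash(in_hook)` are equal, so a hook that calls hg repeatedly would reuse one server instead of spawning a new one per distinct `HG_*` value.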
Pulkit Goyal
9d833da676 py3: make keys of keyword arguments strings
Keys of keyword arguments on Python 3 have to be strings. We are dealing with
bytes in our codebase, so the keys are also bytes. They are converted using
pycompat.strkwargs().

Also after this patch, `hg version` now runs on Python 3.5. Hurray!
2016-12-13 20:53:40 +05:30
Jun Wu
1d0e485ba1 error: make it clear that ProgrammingError is for mercurial developers
The word "developer" could refer to users - people using hg are likely to be
developers. Add adjectives to make it refer to mercurial developers only.
2016-12-12 08:01:52 +00:00
Remi Chaintron
cc88d4a3c4 revlog: merge hash checking subfunctions
This patch factors the behavior of both methods into 'checkhash'.
2016-12-13 14:21:36 +00:00
Stanislau Hlebik
420d75485a bookmarks: make bookmarks.comparebookmarks accept binary nodes (API)
Binary bookmark format should be used internally. It doesn't make sense to have
optional parameters `srchex` and `dsthex`. This patch removes them. It will
also be useful for `bookmarks` bundle2 part because unnecessary conversions
between hex and bin nodes will be avoided.
2016-12-09 03:22:26 -08:00
Stanislau Hlebik
420e1ab2a8 bookmarks: rename compare() to comparebookmarks() (API)
Next commit will remove optional parameters from `compare()` function.
Let's rename `compare()` to `comparebookmarks()` to avoid ambiguity from
callers from external extensions.
2016-11-22 01:33:31 -08:00
Gábor Stefanik
b631d16eab graft: support grafting changes to new file in renamed directory (issue5436) 2016-12-05 17:40:01 +01:00
Jun Wu
3f639e27a5 rebase: calculate ancestors for --base separately (issue5420)
Previously, the --base option only worked with a single "branch" - if there
is one changeset in the "--base" revset whose branching point(s) is/are
different from another changeset in the "--base" revset, "rebase" will error
out with:

  abort: source is ancestor of destination

This happens if the user has multiple draft branches, and uses "hg rebase -b
'draft()' -d master", for example. The error message looks cryptic to users
who don't know the implementation detail.

This patch changes the logic to calculate the common ancestor for every
"base" changeset separately so we won't (incorrectly) select "source" which
is an ancestor of the destination.

This patch should not change the behavior where all changesets specified by
"--base" have the same branching point(s).

A new situation is: some of the specified changesets could be rebased, while
some couldn't (because they are descendants of the destination, or they do
not share a common ancestor with the destination). The current behavior is
to show "nothing to rebase" and exit with 1.

This patch maintains the current behavior (show "nothing to rebase") even if
part of the "--base" revset could be rebased. A clearer error message may be
"cannot find branching point for X", or "X is a descendant of destination".
The error message issue is tracked by issue5422 separately.

A test is added with all kinds of tricky cases I could think of for now.
2016-11-28 05:45:22 +00:00
Pulkit Goyal
b18d8e2c04 py3: utility functions to convert keys of kwargs to bytes/unicodes
Keys of keyword arguments need to be str (unicodes) on Python 3. We have a lot
of functions where we pass keyword arguments. Having utility functions that
convert keys to unicodes before passing them and convert them back to bytes
once inside the function will be helpful. We now have functions named
pycompat.strkwargs(dic) and pycompat.byteskwargs(dic) to help us.
2016-12-07 21:53:03 +05:30
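
A minimal sketch of the two helpers for Python 3, assuming latin-1 as the key encoding; the function `show` and its `rev` option are hypothetical and exist only to demonstrate the round trip.

```python
def strkwargs(dic):
    """Convert bytes keys to str so the dict can be passed with **."""
    return {k.decode("latin-1"): v for k, v in dic.items()}


def byteskwargs(dic):
    """Convert str keys back to bytes inside the called function."""
    return {k.encode("latin-1"): v for k, v in dic.items()}


def show(**opts):
    # Hypothetical command body: the rest of the code expects bytes keys.
    opts = byteskwargs(opts)
    return opts[b"rev"]


result = show(**strkwargs({b"rev": b"tip"}))
```

Only the keys are converted; values stay as bytes throughout.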
Pulkit Goyal
3f64a7a3eb py3: make a bytes version of getopt.getopt()
getopt.getopt() deals with unicodes on Python 3 internally, and if bytes
arguments are passed it raises TypeError. So we now have
pycompat.getoptb(), which takes bytes arguments, converts them to unicode, calls
getopt.getopt(), converts the returned values back to bytes and then
returns them.
All the instances of getopt.getopt() are replaced with pycompat.getoptb().
2016-12-06 06:36:36 +05:30
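
The wrapper described above could look roughly like this on Python 3. It is a sketch under the assumption that latin-1 is used for the conversion; it is not necessarily the exact body of pycompat.getoptb().

```python
import getopt


def getoptb(args, shortlist, namelist):
    """Bytes-in, bytes-out wrapper around getopt.getopt() (sketch)."""
    # Decode everything, since getopt works on str in Python 3.
    uargs = [a.decode("latin-1") for a in args]
    ushort = shortlist.decode("latin-1")
    unames = [n.decode("latin-1") for n in namelist]
    opts, rest = getopt.getopt(uargs, ushort, unames)
    # Encode the results back so callers keep working with bytes.
    opts = [(o.encode("latin-1"), v.encode("latin-1")) for o, v in opts]
    rest = [a.encode("latin-1") for a in rest]
    return opts, rest


opts, rest = getoptb([b"-r", b"tip", b"--verbose", b"file"], b"r:", [b"verbose"])
```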
Jun Wu
51bd231f28 parsers: use buffer to store revlog index
Previously, the revlog index passed to parse_index2 must be a "string",
which means we have to read the whole revlog index into memory. This patch
makes the code accept a generic Py_buffer, to be more flexible - it could be
a "string", or anything that implements the buffer interface, like a mmap-ed
region.

Note: ideally we want to remove the "data" field. However, it is still used
in parse_index2:

    if (idx->inlined) {
        cache = Py_BuildValue("iO", 0, idx->data);
        ....
    }
    ....
    tuple = Py_BuildValue("NN", idx, cache);
    ....
    return tuple;

Its only users are revlogio.parseindex and revlog.__init__:

    # revlogio.parseindex
    index, cache = parsers.parse_index2(data, inline)
    return index, getattr(index, 'nodemap', None), cache

    # revlog.__init__
    d = self._io.parseindex(indexdata, self._inline)
    self.index, nodemap, self._chunkcache = d

Maybe we could move the logic (testing inline and returning "data" object)
to revlog.py. But that should be a separate patch.
2016-12-06 11:44:49 +00:00
Pulkit Goyal
9523fa9b6b fancyopts: switch from fancyopts.getopt.* to getopt.*
In the next patch, we will be creating a bytes version of getopt.getopt() and
doing that will leave getopt as unused import in fancyopts. So before removing
that there are instances in codebase where instead of importing getopt, we
have used fancyopts.getopt. This patch will switch all those cases so that
the next patch can remove the import of getopt from fancyopts without breaking
things.
2016-12-06 06:27:58 +05:30
Pulkit Goyal
617e2aec61 py3: use pycompat.fsdecode() to pass to imp.* functions
When we try to pass a bytes argument to a function from the imp library, it
raises TypeError as it deals with unicodes internally. So we can't use bytes
with imp.* functions. Hunting through this, I found we were returning a bytes
path variable to loadpath() on Python 3.5 (yes, most of our codebase is
dealing with bytes on Python 3, especially the path variables). Passing unicode
does not defeat the purpose of loading the extensions and a module object is
returned.
2016-12-05 06:46:51 +05:30
Jun Wu
f9c05a235e localrepo: use ProgrammingError
This is an example usage of ProgrammingError. Let's start migrating
RuntimeError to ProgrammingError.

The code only runs when devel.all-warnings or devel.check-locks is set, so
it does not affect the end-user experience.
2016-12-06 17:06:39 +00:00
Jun Wu
d8c4533301 error: add ProgrammingError
We have a requirement to express "this is clearly an error caused by the
programmer". The code base uses RuntimeError for that in some places, which is
not ideal. So let's add a formal exception for that.
2016-12-06 14:57:47 +00:00
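
As an illustration of the pattern, here is a minimal sketch. Deriving the exception from RuntimeError and the function `write_store` are assumptions made for the example; only the name ProgrammingError comes from the commit.

```python
class ProgrammingError(RuntimeError):
    """Raised when a Mercurial (core or extension) developer made a mistake."""


def write_store(lock_held):
    # Hypothetical API: calling it without the lock is a developer bug,
    # not something an end user can cause or fix.
    if not lock_held:
        raise ProgrammingError("write_store() called without holding the lock")
    return "written"


try:
    write_store(lock_held=False)
    caught = None
except ProgrammingError as exc:
    caught = str(exc)
```

Subclassing RuntimeError in this sketch means code that already catches RuntimeError keeps working while call sites migrate.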
Jun Wu
5b83a79f69 chgserver: call "load" for new ui objects
After 81ed7b0f8a46, we need to call "ui.load" explicitly to load config
files.
2016-12-05 21:36:35 +00:00
Pulkit Goyal
6e996c4d4c localrepository: remove None as default value of path argument in __init__()
The path variable in localrepository.__init__() has a default value of None, so
it gives us an option to create an object of the localrepository class without
a path variable. But things break if you try to do so. The second line in the
init, which is executed when we try to create a localrepository object,
calls os.path.expandvars(path), which raises

TypeError: argument of type 'NoneType' is not iterable

I checked the places where it is called and can't find any piece of code
which calls it without the path variable. Also, if something is calling it, it
should break.
2016-12-04 23:22:34 +05:30
Pulkit Goyal
849c625bac py3: use pycompat.sysstr() in __import__()
__import__() on Python 3 accepts strings which are different from that of
Python 2. Used pycompat.sysstr() to get string accordingly.
2016-12-01 13:12:04 +05:30
Pulkit Goyal
4245797c62 py3: avoid use of basestring
"In this case, result is a source variable of a list to be returned, it
shouldn't be unicode. Hence we can use bytes instead of basestring here." -Yuya
2016-11-30 23:51:11 +05:30
Pulkit Goyal
bf0008abe9 py3: use unicodes in __slots__
__slots__ in Python 3 accepts only unicodes and there is no harm using
unicodes in __slots__. So just adding u'' is fine. Previous occurrences of this
problem are treated the same way.
2016-11-30 23:38:50 +05:30
Mateusz Kwapich
3de25e93ce memctx: allow the metadataonlyctx that's reusing the manifest node
When we have a lot of files writing a new manifest revision can be expensive.
This commit adds a possibility for memctx to reuse a manifest from a different
commit. This can be beneficial for commands that are creating metadata changes
without any actual files changed like "hg metaedit" in evolve extension.

I will send the change for evolve that leverages this once this is accepted.
2016-11-21 08:09:41 -08:00
Mateusz Kwapich
529860b433 localrepo: make it possible to reuse manifest when committing context
This makes the commit function understand the context that's reusing manifest.
2016-11-17 10:59:15 -08:00
Mateusz Kwapich
55fff531e4 manifest: expose the parents() method 2016-11-17 10:59:15 -08:00
Gregory Szorc
a071fc938c httppeer: assign Vary request header last
In preparation for adding another value to it in a subsequent patch.

While I was here, I added some empty lines because walls of text
are hard to read.
2016-11-28 21:07:51 -08:00
Gregory Szorc
6a4fd5ab05 wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)
Previously, the capabilities list was protocol agnostic and we
advertised the same capabilities list to all clients, regardless of
transport protocol.

A few capabilities are specific to HTTP. I see no good reason why we
should advertise them to SSH clients. So this patch limits their
advertisement to HTTP clients.

This patch is BC, but SSH clients shouldn't be using the removed
capabilities so there should be no impact.
2016-11-28 20:46:42 -08:00
Gregory Szorc
2220c845b2 protocol: declare transport protocol name
We add an attribute to the HTTP and SSH protocol implementations
identifying the transport so future patches can conditionally
expose capabilities on a per-transport basis.
2016-11-28 20:46:59 -08:00
Mads Kiilerich
ecd76cc19e bdiff: early pruning of common prefix before doing expensive computations
It seems quite common that files don't change completely. New lines are often
pretty much appended, and modifications will often only change a small section
of the file which on average will be in the middle.

There can thus be a big win by pruning a common prefix before starting the more
expensive search for longest common substrings.

Worst case, it will scan through a long sequence of similar bytes without
encountering a newline. Splitlines will then have to do the same again ...
twice for each side. If similar lines are found, splitlines will save the
double iteration and hashing of the lines ... plus there will be fewer lines to
find common substrings in.

This change might in some cases make the algorithm pick shorter or less optimal
common substrings. We can't have the cake and eat it.

This makes hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
increase.

On mozilla-unified:
perfbdiff -m 3041e4d59df2
! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
perfbdiff 0e9928989e9c --alldata --count 10
! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)
2016-11-16 19:45:35 +01:00
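
The prefix pruning can be sketched in Python as below. The actual change is in the C bdiff code; the function name `common_line_prefix` and the choice to cut at the last shared newline are assumptions for this illustration.

```python
def common_line_prefix(a, b):
    """Return the length of the common prefix, rounded down to a line end.

    Only whole shared lines are pruned, so the remaining tails can still be
    split into lines and compared by the more expensive algorithm.
    """
    limit = min(len(a), len(b))
    lastnl = 0
    i = 0
    while i < limit and a[i] == b[i]:
        if a[i] == 0x0A:      # indexing bytes yields ints; 0x0A is "\n"
            lastnl = i + 1
        i += 1
    return lastnl


old = b"a\nb\nc\n"
new = b"a\nb\nd\n"
cut = common_line_prefix(old, new)
old_tail, new_tail = old[cut:], new[cut:]
```

In the worst case mentioned in the message, a long identical run without any newline is scanned and nothing is pruned, because `lastnl` stays at 0.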
Yuya Nishihara
0a5e04d63d formatter: add overview of API and example as doctest 2016-10-22 15:02:11 +09:00
Yuya Nishihara
1d44bd2bbb ui: factor out ui.load() to create a ui without loading configs (API)
This allows us to write doctests depending on a ui object, but not on global
configs.

ui.load() is a class method so we can do wsgiui.load(). All ui() calls but
for doctests are replaced with ui.load(). Some of them could be changed to
not load configs later.
2016-10-22 14:35:10 +09:00
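
The split between constructing a ui and loading configs might look like this in outline. Everything except the names `ui`, `load` and `wsgiui` is hypothetical, including the `_read_global_config` helper and the config contents.

```python
def _read_global_config():
    # Hypothetical stand-in for reading system and user config files.
    return {"ui.username": "example"}


class ui(object):
    def __init__(self):
        # A bare instance carries no global configuration, which keeps
        # doctests independent of the machine they run on.
        self.config = {}

    @classmethod
    def load(cls):
        """Create an instance of cls and populate it from global configs."""
        u = cls()
        u.config.update(_read_global_config())
        return u


class wsgiui(ui):
    pass


plain = ui()
loaded = wsgiui.load()
```

Because `load` is a class method, `wsgiui.load()` yields a `wsgiui` instance without the subclass having to override anything.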
Jun Wu
35a3247fa1 check-code: add a rule to forbid "cp -r"
See the commit message of the previous patch for the reason. In short,
according to the current POSIX standard, "-r" is "removed", and "-R" is the
current standard way to do "copy file hierarchies".
2016-11-30 19:23:04 +00:00
Jun Wu
3ee7ba0bd8 tests: replace "cp -r" with "cp -R"
The POSIX documentation about "cp" [1] says:

  ....

  RATIONALE
    ....
    Earlier versions of this standard included support for the -r option to
    copy file hierarchies. The -r option is historical practice on BSD and
    BSD-derived systems. This option is no longer specified by POSIX.1-2008
    but may be present in some implementations. The -R option was added as a
    close synonym to the -r option, selected for consistency with all other
    options in this volume of POSIX.1-2008 that do recursive directory
    descent.

    The difference between -R and the removed -r option is in the treatment
    by cp of file types other than regular and directory. It was
    implementation-defined how the -r option treated special files to allow
    both historical implementations and those that chose to support -r with
    the same abilities as -R defined by this volume of POSIX.1-2008. The
    original -r flag, for historic reasons, did not handle special files any
    differently from regular files, but always read the file and copied its
    contents. This had obvious problems in the presence of special file
    types; for example, character devices, FIFOs, and sockets.
    ....

  ....

  Issue 6
    The -r option is marked obsolescent.
    ....

  Issue 7
    ....
    The obsolescent -r option is removed.
    ....

  (No "Issue 8" yet)

Therefore it's clear that "cp -R" is strictly better than "cp -r".

The issue was discovered when running tests on OS X after 2e4d149e62aa.

[1]: pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html
2016-11-30 19:25:18 +00:00
Martijn Pieters
ca91b8fcf4 posix: give the cached symlink a real target
The NamedTemporaryFile file is cleared up so checklink ends up as a dangling
symlink, causing cp -r in tests to complain on both Solaris and OS X. Use
a permanent file instead when there is a .hg/cache directory.
2016-11-30 16:39:36 +00:00
Kostia Balytskyi
9b01b1d353 shelve: move patch extension to a string constant
We are using 'name + ".patch"' pattern throughout the shelve code to
identify the existence of a shelve with a particular name. In two
cases however we use 'name + ".hg"' instead. This commit makes
'patch' be used in all places and "emphasizes" it by moving
'patch' to live in a constant. Also, this allows extracting the file
name without its extension like this:
    f[:-(1 + len(patchextension))]
instead of:
    f[:-6]
which is good IMO.

This is a first patch from this initial "obsshelve" series. This
series does not include tests, although locally I have all of
test-shelve.t ported to test obs-shelve as well. I will send tests
later as a separate series.
2016-11-29 07:20:03 -08:00
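
A small sketch of the slicing shown in the message. The constant name `patchextension` appears in the commit; the helper `shelvename` is hypothetical.

```python
patchextension = "patch"


def shelvename(filename):
    """Strip '.<patchextension>' from a shelve file name."""
    # 1 accounts for the dot in front of the extension.
    return filename[:-(1 + len(patchextension))]


name = shelvename("default.patch")
```

For the current extension this is equivalent to `f[:-6]`, but it stays correct if the constant ever changes.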
Kevin Bullock
8647dcca4a merge with stable 2016-12-01 15:55:45 -06:00
Kevin Bullock
4cdf2cb0d1 Added signature for changeset 6c3b7e698555 2016-12-01 14:13:28 -06:00
Kostia Balytskyi
69bbbd3ecf shelve: fix use of unexpected working dirs in test-shelve.t
Fixing some clowniness where we created ~four levels of nested repos
and once (my test case :( ) did not even cd into a created repo.
2016-11-29 04:11:05 -08:00