23665 Commits

Author SHA1 Message Date
Pierre-Yves David
3845a0edb5 discovery: run discovery on filtered repository
We have been running discovery on unfiltered repository for quite some time.
This was aimed at two things:

- save some bandwidth by preventing the re-pushing of common but hidden changesets
- allow phase changes on secret/hidden changesets on bare push.

The cost of this unfiltered discovery combined with evolution is actually really
high. Evolution likely creates thousands of hidden heads, and the discovery is
going to try to discover whether each of them is common or not. For example,
pushing from my development mercurial repository implies 17 discovery
round-trips.

The benefits are rare corner cases while the drawbacks are massive. So we run the
discovery on a filtered repository again.

We add a hack to detect remote heads that are known locally and add them to
the common set anyway, so the good behavior of most of the corner cases should
remain. But this will not work in all cases.

This brings my discovery phase back from 17 round-trips to 1 or 2.
2015-01-07 00:07:29 -08:00
FUJIWARA Katsunori
ac41d830e2 revset: check for collisions between alias argument names in the declaration
Before this patch, collisions between alias argument names in the
declaration are ignored, and this silently causes unexpected alias
evaluation.

This patch checks for such collisions, and aborts (or shows a warning) when
collisions are detected.

This patch doesn't add a test to "test-revset.t", because a doctest is
enough to test the collision detection itself.
2015-01-10 23:18:11 +09:00
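The collision check described in ac41d830e2 can be pictured with a minimal sketch. The function name `check_alias_args` and the error text are hypothetical and are not taken from the Mercurial source; the sketch only shows the idea of rejecting a declaration whose argument names repeat.

```python
def check_alias_args(args):
    """Return the argument names as a list, or raise on a duplicate name.

    Hypothetical helper: it only illustrates detecting collisions between
    alias argument names in a declaration such as "foo($1, $1)".
    """
    seen = set()
    for name in args:
        if name in seen:
            # A repeated name would make alias expansion ambiguous.
            raise ValueError("argument names collide with each other: %s" % name)
        seen.add(name)
    return list(args)
```

Called with `["$1", "$2"]` the helper returns the list unchanged; called with `["$1", "$1"]` it raises `ValueError`.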
FUJIWARA Katsunori
e416b72fc5 revset: parse alias declaration strictly by _parsealiasdecl
Before this patch, alias declarations are parsed by string-based
operations: matching against "^([^(]+)\(([^)]+)\)$" and splitting by
",".

This overlooks many syntax errors such as the following (see the previous
patch introducing "_parsealiasdecl" for detail):

  - an unclosed parenthesis causes the declaration to be treated as an
    "alias symbol"
  - symbol/function names aren't checked for validity
  - an invalid argument list causes unexpected argument names

To parse alias declarations strictly, this patch replaces the parsing
implementation with "_parsealiasdecl".

This patch tests only one typical declaration error case, because
error detection itself is already tested in the doctest of
"_parsealiasdecl".

This also removes the class properties "args" and "error", because these are
always initialized in "revsetalias.__init__".
2015-01-10 23:18:11 +09:00
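To see why the string-based parsing in e416b72fc5 was too lenient, here is a small sketch that applies the regular expression quoted in that message and then splits on ",". The function name `loose_parse` is an assumption made for illustration; only the pattern itself comes from the commit message.

```python
import re

# Pattern quoted in the commit message for the old, loose parsing.
_FUNC_RE = re.compile(r'^([^(]+)\(([^)]+)\)$')


def loose_parse(decl):
    """Return (name, args) as the loose approach would see a declaration.

    args is None when the declaration does not look like "func(...)", in
    which case the whole string is taken as an alias symbol.
    """
    m = _FUNC_RE.search(decl)
    if m is None:
        return decl, None
    args = [a.strip() for a in m.group(2).split(',')]
    return m.group(1), args


print(loose_parse('foo($1, $2)'))  # well-formed declaration
print(loose_parse('foo($1, $2'))   # unclosed parenthesis: taken as a symbol
print(loose_parse('foo("bar")'))   # string literal accepted as an argument name
```

The second and third calls show two of the problems listed in the messages: a declaration with a missing ")" silently becomes a symbol, and a quoted string is accepted as an argument name.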
FUJIWARA Katsunori
87958c780f revset: introduce "_parsealiasdecl" to parse alias declarations strictly
This patch introduces "_parsealiasdecl" to parse alias declarations
strictly. For example, "_parsealiasdecl" can detect the problems below,
which the current implementation can't.

  - un-closed parenthesis causes being treated as "alias symbol"

    because all of declarations not in "func(....)" style are
    recognized as "alias symbol".

    for example, "foo($1, $2" is treated as the alias symbol.

  - alias symbol/function names aren't examined whether they are valid
    as symbol or not

    for example, "foo bar" can be treated as the alias symbol, but of
    course such invalid symbol can't be referred in revset.

  - just splitting argument list by "," causes overlooking syntax
    problems in the declaration

    for example, all of invalid declarations below are overlooked:

    - foo("bar")     => taking one argument named as '"bar"'
    - foo("unclosed) => taking one argument named as '"unclosed'
    - foo(bar::baz)  => taking one argument named as 'bar::baz'
    - foo(bar($1))   => taking one argument named as 'bar($1)'

To keep this patch simple, the current implementation for alias
declarations is replaced by "_parsealiasdecl" in the subsequent
patch. This patch just introduces it.

This patch defines "_parsealiasdecl" not as a method of the "revsetalias"
class but as a function of the "revset" module, for ease of testing by
doctest.

This patch factors some helper functions for "tree" out, because:

  - direct accessing like "if tree[0] == 'func' and len(tree) > 1"
    decreases readability

  - subsequent patch (and also existing code paths, in the future) can
    use them for readability

This patch also factors "_tokenizealias" out, because it can be used
also for parsing alias definitions strictly.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
883b1f7edf revset: store full detail into revsetalias.error for error source distinction
Before this patch, any errors in the declaration of revset alias
aren't detected at all, and there is no information about error source
in the error message.

As a part of preparation for parsing alias declarations and
definitions more strictly, this patch stores full detail into
"revsetalias.error" for error source distinction.

This lets the code that raises "Abort" or warns about potential errors
just use "revsetalias.error" without composing any message.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
ae25ee95c4 revset: factor out composing error message for ParseError to reuse
This patch defines the composing function not in "ParseError" class but
in "revset" module, because:

  - "_()" shouldn't be used in "ParseError", to avoid adding "from
    i18n import _" to the "error" module

  - generalizing message composition of "ParseError" for all code paths
    other than revset isn't the purpose of this patch

    we should also take care of showing "unexpected leading
    whitespace" for some code paths, to generalize widely.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
48233206c2 revset: make tokenize extensible to parse alias declarations and definitions
Before this patch, "tokenize" doesn't recognize the symbol starting
with "$" as a valid one.

This prevents revset alias declarations and definitions from being
parsed with "tokenize", because "$" may be used as the initial letter
of alias arguments.

BTW, the alias argument name doesn't require a leading "$" itself, in
fact. But we have to assume that users may use "$" as the initial
letter of argument names in their aliases, because examples in "hg
help revsets" have used such names for a long time.

To make "tokenize" extensible to parse alias declarations and
definitions, this patch introduces optional arguments "syminitletters"
and "symletters". Giving these sets can change the policy of "valid
symbol" in tokenization easily.

This patch keeps the original examination of letter validity for
reviewability, even though there is redundant interchanging between
"chr"/"ord" at initialization of "_syminitletters" and "_symletters".
Examining at most 256 characters (per initialization) is cheap enough
compared with revset evaluation itself.

This patch is a part of preparation for parsing alias declarations and
definitions more strictly.
2015-01-10 23:18:11 +09:00
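The idea behind the optional "syminitletters" and "symletters" arguments in 48233206c2 can be sketched as follows. This is a simplified tokenizer written for illustration only: the default letter sets and the token tuples are assumptions, not the ones used by Mercurial's "tokenize".

```python
import string

# Assumed defaults for this sketch; the real sets in Mercurial differ.
_SYMINITLETTERS = frozenset(string.ascii_letters + "._@")
_SYMLETTERS = _SYMINITLETTERS | frozenset(string.digits + "-/")


def tokenize(program, syminitletters=None, symletters=None):
    """Yield (type, value, position) tuples for a tiny revset-like language."""
    if syminitletters is None:
        syminitletters = _SYMINITLETTERS
    if symletters is None:
        symletters = _SYMLETTERS
    pos, end = 0, len(program)
    while pos < end:
        c = program[pos]
        if c.isspace():
            pos += 1
        elif c in "(),":
            yield (c, None, pos)
            pos += 1
        elif c in syminitletters:
            start = pos
            pos += 1
            while pos < end and program[pos] in symletters:
                pos += 1
            yield ("symbol", program[start:pos], start)
        else:
            raise ValueError("syntax error at %d" % pos)


# Allowing "$" as an initial letter makes alias declarations tokenizable.
alias_init = _SYMINITLETTERS | frozenset("$")
tokens = list(tokenize("foo($1)", syminitletters=alias_init))
```

With the default sets, the same input raises `ValueError` at the "$"; passing a wider set of initial letters changes the policy without touching the tokenizer body.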
Mads Kiilerich
f01f6edb81 largefiles: make linear update set unsure largefiles normal if unchanged
'hg update' would hash all 'unsure' largefiles before performing the merge. It
would update the standins but not detect the very common case where the
largefile had never been changed by the user but had just been marked with an
invalid dirstate mtime to make sure any changes done by the user in the same
second would be detected. The largefile would remain in that state and would
have to be hashed again next time even though it still had not been changed.
Sad trombone.

Instead, for largefiles listed as 'unsure' or 'modified', after updating the
standin with the actual hash, mark the largefile as normal if it turns out to
not be modified relative to the parent revision. That will
prevent it from being hashed again next time.
2015-01-09 18:38:02 +01:00
Mads Kiilerich
ec0ae429ed debugdirstate: don't hide date field with --nodate, just show 'set'/'unset'
The value of the dirstate date field cannot be used in tests and we thus have
to use debugdirstate with --nodate. It is however still very helpful to be able
to see whether the date field has been set or still is unset. The absence of
that information made it hard to debug some largefile dirstate issues.

This change _could_ make the test suite more unstable ... but that would be
places where the test suite or the code should be made more stable. (Note:
'unset' with the magic negative sizes is reliable. 'unset' for normal sizes
would probably not be reliable, but there are no such occurrences in the test
suite and it should thus be reliable.)

This output wastes more horizontal space in the --nodate output, but it also
keeps things simpler because the output format is always the same. It is just a
debug command so let's keep it simple.
2015-01-09 18:38:02 +01:00
Mads Kiilerich
c9f97805d5 debugdirstate: simplify date handling after 8c6c29da8eee used fixed format
2015-01-09 18:38:02 +01:00
Matt Harbison
85dddfb9c5 forget: don't report rejected files as forgotten as well
It seems like a mistake to report a file as forgotten and rejected.  The
forgotten list doesn't seem to be used by anything in core, so no test changes.
2015-01-11 23:25:23 -05:00
Matt Harbison
6d2831411e largefiles: enable subrepo support for forget
2015-01-11 23:20:51 -05:00
Sean Farley
f534500188 namespaces: add revset for 'named(namespace)'
This patch adds functionality for listing all changesets in a given namespace
via the revset language.
2015-01-13 15:07:08 -08:00
Durham Goode
2591767a70 bundles: do not overwrite existing backup bundles (BC)
Previously, a backup bundle could overwrite an existing bundle and cause user
data loss. For instance, if you have A<-B<-C and strip B, it produces backup
bundle B-backup.hg. If you then hg pull -r B B-backup.hg and strip it again, it
overwrites the existing B-backup.hg and C is lost.

The fix is to add a hash of all the nodes inside that bundle to the filename.
Fixed up existing tests and added a new test in test-strip.t
2015-01-09 10:52:14 -08:00
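The fix in 2591767a70 adds a hash of all the nodes in the bundle to the backup file name. A minimal sketch of that idea follows; the function name, the file name layout, and the choice to sort the nodes and truncate the digest are all assumptions made for illustration, not the actual Mercurial naming scheme.

```python
import hashlib


def backup_bundle_name(stripped_hex, nodes):
    """Build a backup file name that depends on every node in the bundle.

    stripped_hex: hex id of the stripped changeset (str)
    nodes: iterable of binary node ids (bytes) contained in the bundle
    """
    # Sorting makes the digest independent of iteration order (assumption).
    digest = hashlib.sha1(b"".join(sorted(nodes))).hexdigest()
    return "%s-%s-backup.hg" % (stripped_hex[:12], digest[:8])


# Stripping B together with C, then stripping B alone, gives two names.
name_bc = backup_bundle_name("b" * 40, [b"B" * 20, b"C" * 20])
name_b = backup_bundle_name("b" * 40, [b"B" * 20])
```

Because the two bundles hold different node sets, the second strip no longer overwrites the first backup.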
Alex Orange
04223e707f https: support tls sni (server name indication) for https urls (issue3090)
SNI is a common way of sharing servers across multiple domains using separate
SSL certificates. As of Python 2.7.9 SSLContext has been backported from
Python 3. This patch changes sslutil's ssl_wrap_socket to use SSLContext and
take a server hostname as an argument. It also changes the url module to make
use of this argument.

The new code for 2.7.9 achieves its task by attempting to get the SSLContext
object from the ssl module. If this fails the try/except goes back to what was
there before with the exception that the ssl_wrap_socket functions take a
server_hostname argument that doesn't get used. Assuming the SSLContext
exists, the arguments to wrap_socket at the module level are emulated on the
SSLContext. The SSLContext is initialized with the specified ssl_version. If
certfile is not None load_cert_chain is called with certfile and keyfile.
keyfile being None is not a problem, load_cert_chain will simply expect the
private key to be in the certificate file. verify_mode is set to cert_reqs. If
ca_certs is not None load_verify_locations is called with ca_certs as the
cafile. Finally the wrap_socket method of the SSLContext is called with the
socket and server hostname.

Finally, this fails test-check-commit-hg.t because the "new" function
ssl_wrap_socket has underscores in its name and underscores in its arguments.
All the underscore identifiers are taken from the other functions and as such
can't be changed to match naming conventions.
2015-01-12 18:01:20 -07:00
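The sequence of SSLContext calls described in 04223e707f can be sketched with the standard `ssl` module of current Python 3. This is not the patch itself: the helper name `build_context` is hypothetical, `ssl.PROTOCOL_TLS_CLIENT` stands in for the caller-specified ssl_version, and disabling the built-in hostname check is an assumption made so that `verify_mode` can be set freely.

```python
import ssl


def build_context(certfile=None, keyfile=None,
                  cert_reqs=ssl.CERT_NONE, ca_certs=None):
    """Emulate module-level wrap_socket arguments on an SSLContext."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: hostname validation is done elsewhere by the caller, so
    # the built-in check is switched off before verify_mode is changed.
    ctx.check_hostname = False
    ctx.verify_mode = cert_reqs
    if certfile is not None:
        # keyfile may be None; the private key is then read from certfile.
        ctx.load_cert_chain(certfile, keyfile)
    if ca_certs is not None:
        ctx.load_verify_locations(cafile=ca_certs)
    return ctx


def ssl_wrap_socket(sock, server_hostname=None, **kwargs):
    """Wrap sock, passing server_hostname so the client sends SNI."""
    ctx = build_context(**kwargs)
    return ctx.wrap_socket(sock, server_hostname=server_hostname)
```

Passing `server_hostname` to `SSLContext.wrap_socket` is what lets a client name the host it wants during the handshake, which is the point of SNI.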
Matt Mackall
d829c30933 merge with stable
2015-01-14 12:50:46 -08:00
Matt Mackall
f10752833b unpacker: check the right exception type for 2.4
2015-01-13 16:15:02 -08:00
Yuya Nishihara
47e97cb140 revset: fix spanset.isascending() to honor sort() or reverse() request
Because spanset.isascending() ignored the ascending flag, the result of
"fullreposet() & x" was always sorted in ascending order.

The test case is carefully chosen to call fullreposet.__and__.
2015-01-10 21:31:59 +09:00
Anton Shestakov
b80745c666 hgweb: fix diffstat links in paper/changeset.tmpl
'<a .../>foo</a>' syntax is incorrect, since the first tag just "tries" to
close itself and then the actual content follows. It doesn't work, either
because web browsers know better than this or because there should be
whitespace before /: '<a />'. So for hgweb users the links looked
normal anyway, but now they are correct in code as well.
2015-01-10 18:00:57 +08:00
Anton Shestakov
298df6419f hgweb: close <img> elements
Templates declare xhtml doctype, which means, in particular, that the document
must also be valid xml. So <img> elements must be closed.
2015-01-10 17:54:24 +08:00
Anton Shestakov
83e4d2ac53 hgweb: close <p> elements
<p> elements can only contain inline elements, so as soon as a browser encounters
a block element (e.g. block <div>) "inside" a <p>, it inserts an implicit </p>.
It's better to do this explicitly.
2015-01-10 17:52:02 +08:00
Anton Shestakov
ae61ab9707 hgweb: close <th> properly in spartan/filelogentry.tmpl
2015-01-10 17:44:54 +08:00
Yuya Nishihara
bf8b92850c revset: simplify fullreposet.__and__ to call sort() with boolean flag
Note that sort() takes a boolean flag, so other.sort(reverse) was wrong.
It just worked fine because there is a top-level function, reverse().
2015-01-10 21:36:42 +09:00
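The bug noted in bf8b92850c, passing a function where a boolean flag is expected, can be reproduced in miniature. The class `SmallSet` and the helper `reverse` below are hypothetical stand-ins, not Mercurial code; they only show why such a call appears to work while always selecting descending order.

```python
def reverse(items):
    """Stand-in for an unrelated module-level function named 'reverse'."""
    return list(reversed(items))


class SmallSet:
    """Toy set that records only the requested ordering."""

    def __init__(self):
        self.ascending = True

    def sort(self, reverse=False):
        # Any truthy value, including a function object, selects descending.
        self.ascending = not bool(reverse)


buggy = SmallSet()
buggy.sort(reverse)  # passes the function above: truthy, so descending

want_ascending = True
fixed = SmallSet()
fixed.sort(reverse=not want_ascending)  # passes a real boolean flag
```

The first call raises no error because the name `reverse` resolves to the module-level function, so the mistake goes unnoticed until the ordering matters.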
Augie Fackler
39d629927d hghave: we now support Python 2.7.9's ssl for https
2015-01-13 15:08:55 -05:00
Augie Fackler
d7a053040f Makefile.python: try curl if wget fails
Macs ship with curl and not wget, so this is a nice little tweak for
folks testing on OS X.
2015-01-13 14:15:08 -05:00
Augie Fackler
7d0ebc57bd test-https: glob error messages more so we pass on Python 2.7.9
Python 2.7.9 cleans up how it stringifies SSL errors, so we have to look only
for the important bit (certificate verify failed) rather than looking for
specific ssl module goop (which is now unstable).
2015-01-13 15:15:37 -05:00
Martin von Zweigbergk
c91c749e46 filelog: fix backwards comment for 'backrevref'
2015-01-12 09:46:56 -08:00
Martin von Zweigbergk
63a49977e2 filelog: remove trailing "form feed" character
2015-01-12 09:49:25 -08:00
Martin von Zweigbergk
3e0453e7c4 filelog: remove unused variable 'lkr'
It's used further down, but it's overwritten before, so it's
technically a dead assignment, but unnecessary nevertheless.
2015-01-12 09:48:05 -08:00
Matt Harbison
9825da1159 branchmap: add seek() to end of file before calling tell() on append open()
This is similar to 5274228efcdc, which was subsequently modified in dd809b0d9714
for 2.4.  Unexpected test changes on Windows occurred without this.
2015-01-10 12:00:03 -05:00
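The pattern behind 9825da1159 is to seek to the end of a file opened for append before trusting tell(). A minimal sketch follows; the helper names and the temporary file are assumptions made for illustration.

```python
import os
import tempfile


def append_offset(path, data):
    """Append data to path and return the offset at which it was written."""
    with open(path, "ab") as f:
        # On some platforms the position right after an append-mode open is
        # not guaranteed to be the end of the file, so move there explicitly.
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(data)
    return offset


def demo():
    """Write three bytes, then append twice and report both offsets."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"abc")
        return append_offset(path, b"de"), append_offset(path, b"f")
    finally:
        os.remove(path)
```

The explicit seek costs one extra call and makes the reported offset independent of how the platform initializes the file position.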
Matt Harbison
32718f6e82 tests: fix test-casefolding.t output for branchcache
This belongs with dfd9a7b93d3a.  I assume that the failure to read is OK,
because there is similar output in test-convert-svn-encoding.t.
2015-01-09 22:14:01 -05:00
Pierre-Yves David
4f0528a29a setdiscovery: remove '_setupsample' function
It is now unused.
2015-01-06 17:19:21 -08:00
Pierre-Yves David
c84e477509 setdiscovery: document '_takequicksample'
2015-01-07 20:44:20 -08:00
Pierre-Yves David
2ee7c25abf setdiscovery: drop '_setupsample' usage in '_takequicksample'
For '_takefullsample' we can just retrieve the list of heads directly and
ignore the rest of the complex return values. This was the last call to the
infamous '_updatesample' function.
2015-01-06 17:07:44 -08:00
Pierre-Yves David
c05e3eea5d setdiscovery: drop the 'always' argument to '_updatesample'
This argument exists because of the complex code flow in '_takequicksample'. It
first gets the list of heads and then calls '_updatesample' on an empty initial
sample and a size limit matching the differences between the number of heads and
the target sample size. Finally the heads and the sample from '_updatesample'
were added. To ensure this addition result had the exact target length, the code
had to ensure no elements from the heads were added to the '_updatesample'
content and therefore was passing this "always included set of heads".

Instead we can just update the initial heads sample directly and use the final
target size as target size for the update.

This removes the need for this 'always' parameter to the '_updatesample' function.

The tests are affected because different set building order results in different
random sampling.
2015-01-07 10:32:17 -08:00
Pierre-Yves David
6ff053fa11 setdiscovery: always add exponential sample to the heads
As explained in a previous changeset, prioritizing heads too much behaves
pathologically when there are more heads than the sample size. To counter this,
we always inject exponential samples before reducing to the sample size limit.

This already shows some benefit in the tests themselves, and on a real-world
example it moves my discovery for a push to a pathologically headed repo from
45 rounds to 17.

We should maybe ensure that at least 25% of the result sample is heads, but I
think the random sampling will be fine in practice.
2015-01-07 17:28:51 -08:00
Pierre-Yves David
60a9cd0334 setdiscovery: directly run '_updatesample'
The heads and exponential sample are going to end up in the same set
before any extra processing happens. We simplify the code by directly
updating a set with heads.

Changes in the order the set is built lead to small changes in the random
sampling output. But after double checking, I can confirm the input data to
the random sampling is consistent.
2015-01-07 17:23:21 -08:00
Pierre-Yves David
252ba1a3c3 setdiscovery: stop using '_setupsample' in '_takefullsample'
Very few of the return values of '_setupsample' remain in use, so we
directly retrieve the value we care about and drop the '_setupsample'
call.
2015-01-07 17:17:56 -08:00
Pierre-Yves David
e3605ecf1f setdiscovery: randomly pick between heads and sample when taking full sample
Before this changeset, the discovery protocol was too heads-centric. Heads of the
undiscovered set were always sent for discovery and any room remaining in the
sample was filled with exponential samples (and random ones if any room
remained).

This behaved extremely poorly when the number of heads exceeded the sample size,
because we keep just asking about the existence of heads, then their direct parent
and so on. As a result, the 'O(log(len(repo)))' discovery turns into a
'O(len(repo))' one. As a solution we take a random sample of the heads plus
exponential samples. This way we ensure some exponential sampling is achieved,
bringing back some logarithmic convergence of the discovery again.

This patch only applies this principle in one place. More places will be updated
in future patches.

One test is impacted because the random sample happens to be different. By
chance, it helps a bit in this case.
2015-01-07 12:09:51 -08:00
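The sampling strategy described in e3605ecf1f can be sketched on a toy history. Everything here is an illustrative assumption: the graph is a plain dict mapping a revision to its parents, and the function names do not correspond to the real setdiscovery helpers. The sketch walks back from the heads, keeps revisions at distances 1, 2, 4, 8, ... and then draws a random subset of heads plus exponential samples when the result exceeds the sample size.

```python
import collections
import random


def exponential_sample(parents, heads):
    """Return revisions at exponentially growing distances from the heads."""
    sample, seen, dist = set(), set(), {}
    visit = collections.deque(heads)
    factor = 1
    while visit:
        curr = visit.popleft()
        if curr in seen:
            continue
        seen.add(curr)
        d = dist.setdefault(curr, 1)
        if d > factor:
            factor *= 2
        if d == factor:
            sample.add(curr)
        for p in parents.get(curr, ()):
            dist.setdefault(p, d + 1)
            visit.append(p)
    return sample


def take_full_sample(parents, heads, size, rng=random):
    """Mix heads with exponential samples, then cut down to size at random."""
    sample = set(heads) | exponential_sample(parents, heads)
    if len(sample) > size:
        sample = set(rng.sample(sorted(sample), size))
    return sample


# Linear history 0 <- 1 <- ... <- 15 with a single head.
chain = {i: [i - 1] for i in range(1, 16)}
chain[0] = []
picked = exponential_sample(chain, [15])
```

Because the cut is random over heads and exponential samples alike, a repository with more heads than the sample size still gets some deep revisions into each round, which is what restores the logarithmic behavior the message describes.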
Pierre-Yves David
6141054495 setdiscovery: document the '_updatesample' function
This function is central in the sample building process; having it documented
helps code readability a lot.
2015-01-06 17:02:32 -08:00
Pierre-Yves David
2fde36047b setdiscovery: avoid calling any sample building if the undecided set is small
If the length of undecided is smaller than the sample size, we can just request
information for all of them.

This conditional was previously handled by '_setupsample'. But '_setupsample' is
in my opinion a problematic function with blurry semantics. Having this
conditional explicitly earlier makes the code more explicit and moves us closer
to removing this '_setupsample' function.
2015-01-06 16:40:33 -08:00
Pierre-Yves David
ef881538c4 setdiscovery: delay sample building calls to gather them in a single place
Some of the logic around sample building is duplicated in the sample builders.
Extracting it into the top function would clean things up, but this requires
all of that code to be in the same place.

This changeset mostly exists to make the next one more clear.
2015-01-07 09:30:06 -08:00
Pierre-Yves David
ebee9c1c62 setdiscovery: drop unused 'initial' argument for '_takequicksample'
There is a single call site, and it is always using 'initial=True'. So we just drop
the argument and the associated condition.
2015-01-06 16:32:23 -08:00
Matt Mackall
efd707b6d7 readmarkers: add a SHA256 fixme note
2015-01-11 16:46:13 -06:00
Matt Mackall
1b1c572cac readmarkers: fast-path single successors and parents
This gives about a 5% performance bump.
2015-01-11 16:37:57 -06:00
Matt Mackall
aaefce821d readmarkers: promote global constants to locals for performance
2015-01-11 15:35:09 -06:00
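Promoting globals to locals, as in aaefce821d, is a common CPython micro-optimization: local name lookups are cheaper than global or attribute lookups inside a hot loop. The sketch below uses a made-up record format and function name; it is not the obsolescence marker format and makes no claim about the actual speedup.

```python
import struct

# Hypothetical fixed-size record: two big-endian unsigned shorts.
_RECORD = ">HH"
_RECORD_SIZE = struct.calcsize(_RECORD)


def read_records(data):
    """Decode consecutive fixed-size records from data."""
    # Bind globals and the attribute lookup to locals once, before the loop.
    unpack = struct.unpack
    fmt = _RECORD
    size = _RECORD_SIZE
    # Compute the loop bound once instead of subtracting on every iteration.
    stop = len(data) - size
    records = []
    off = 0
    while off <= stop:
        records.append(unpack(fmt, data[off:off + size]))
        off += size
    return records
```

Precomputing `stop` follows the same reasoning: work that does not change between iterations is moved out of the loop.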
Matt Mackall
5a2d46743d readmarkers: drop a temporary
2015-01-11 14:52:57 -06:00
Matt Mackall
f70b8899f1 readmarkers: read node reading into node length conditional
This removes some conditional assignments.
2015-01-11 14:51:49 -06:00
Matt Mackall
fd24385ccc readmarkers: drop a temporary
Two other temporaries are renamed to fit line-length.
2015-01-11 14:46:55 -06:00
Matt Mackall
54920da7a8 readmarkers: hoist subtraction out of loop comparison
2015-01-11 14:44:57 -06:00