The next patch will add a flag for strict parsing of early options, where
we'll have to parse all early options at once instead of processing them
one-by-one by dispatch._earlygetopt(). That's why I decided to hook
fancyopts().
All dispatch._early*opt() functions is planned to be replaced with this
function. But in this stable series, only the strict mode will be handled
by fancyopts.earlygetopt().
The list parser is complex and reusable without ui. Let's move it to
config.py.
This allows us to parse a list from a "pure" config object without going
through ui. Like, we can make "_trustusers" calculated from raw configs,
instead of making sure it's synchronized by calling "fixconfig"s.
This allows us to handle bytes in mostly the same manner as Python 2 str,
so we can get rid of ugly s[i:i + 1] hacks:
s = bytestr(s)
while i < len(s):
c = s[i]
...
This is the simpler version of the previous RFC patch which tried to preserve
the bytestr type if possible. New version simply drops the bytestr wrapping
so we aren't likely to pass a bytestr to a function that expects Python 3
bytes.
New revsetlang module hosts parser, tokenizer, and miscellaneous functions
working on parsed tree. It does not include functions for evaluation such as
getset() and match().
2288 mercurial/revset.py
684 mercurial/revsetlang.py
2972 total
get*() functions are aliased since they are common in revset.py.
These classes are pretty large and independent from revset computation.
2961 mercurial/revset.py
973 mercurial/smartset.py
3934 total
revset.prettyformatset() is renamed to smartset.prettyformat(). Smartset
classes are aliased since they are quite common in revset.py.
I'm not entirely happy with using a trailing / on a "file" entry for
transferring a treemanifest. We've discussed putting some flags on
each file header[0], but I'm unconvinced that's actually any better:
if we were going to add another feature to the cg format we'd still be
doing a version bump anyway to cg4, so I'm inclined to not spend time
coming up with a more sophisticated format until we actually know what
the next feature we want to stuff in a changegroup will be.
Test changes outside test-treemanifest.t are only due to the new CG3
bundlecap showing up in the wire protocol.
Many thanks to adgar@google.com and martinvonz@google.com for helping
me with various odd corners of the changegroup and treemanifest API.
0: It's not hard refactoring, nor is it a lot of work. I'm just
disinclined to do speculative work when it's not clear what the
customer would actually be.
A fix for issue2653 with f5abbf51a76e introduced a discrepancy how default
branch should be denoted when converting with branchmap from different SCM.
E.g. for Git and Mercurial you need to use 'default' whilst for Perforce and
SVN you had to use 'None'. This changeset unifies 'default' for such purposes
whilst falling back to 'None' when no 'default' mapping specified.
Instead of re-parsing quoted strings as templates, the tokenizer can delegate
the parsing of nested template strings to the parser. It has two benefits:
1. syntax errors can be reported with absolute positions
2. nested template can use quotes just like shell: "{"{rev}"}"
It doesn't sound nice that the tokenizer recurses into the parser. We could
instead make the tokenize itself recursive, but it would be much more
complicated because we would have to adjust binding strengths carefully and
put dummy infix operators to concatenate template fragments.
Now "string" token without r"" never appears. It will be removed by the next
patch.
Before this patch, "reporelpath()" uses "rstrip(os.sep)" to trim
"os.sep" at the end of "parent.root" path.
But it doesn't work correctly with some problematic encodings on
Windows, because some multi-byte characters in such encodings contain
'\\' (0x5c) as the tail byte of them.
In such cases, "reporelpath()" leaves unexpected '\\' at the beginning
of the path returned to callers.
"lcalrepository.root" seems not to have tail "os.sep", because it is
always normalized by "os.path.realpath()" in "vfs.__init__()", but in
fact it has tail "os.sep", if it is a root (of the drive): path
normalization trims tail "os.sep" off "/foo/bar/", but doesn't trim
one off "/".
So, just avoiding "rstrip(os.sep)" in "reporelpath()" causes
regression around issue3033 fixed by e3dfde137fa5.
This patch introduces "pathutil.normasprefix" to normalize specified
path in the specific way for problematic encodings without regression
around issue3033.
posixpath.split() strips '/' from the dirname *unless it is the root*. This
patch reproduces this behavior in posix.split(). The old behavior causes a
crash when creating a file at the root of the repo with localrepo.wfile()
when the repo is at the root of the filesystem.
Add a doctest with an hopefuly-comprehensive list of combinations
we can expect in real-life situations.
This does not cover corner cases, for example when a CR or LF is
embedded in the name (allowed by RFC 5322!).
Code in tests/test-doctest.py contributed by:
Martin Geisler <mg@aragost.com>
Thanks!
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
# this change redones part of e0051068893a, backed out by 38c00c035629
Some character encodings use ASCII characters other than
control/alphabet/digit as a part of multi-bytes characters, so direct
replacing with such characters on strings in local encoding causes
invalid byte sequences.
[mpm: test changed to simple doctest]
In a date like 10:30, there are two underspecified ends: the specific
end (seconds) and the broad end (day, month, year). When matching
"10:30", we need to allow the specific end to go from 0 to 59 seconds,
while the broad end is assumed to be today's date.
Similar handling applies for a date range like "Mar 1": year is fixed
to today, any time matches.