On platforms not supporting shell expansion, scmutil.match() performs glob
expansion on 'pats' arguments. But _matchfiles() revset calls match.match()
directly, bypassing this behaviour. To avoid duplicating scmutil.match(), a
secondary scmutil.matchandpats() is introduced returning both the match object
and the expanded inputs. Note the expanded pats are also needed in the fast
path, and will be used by --follow code path.
Currently, --follow FILE looks for a FILE filelog, scans it and collects
linkrevs and renames, then filters them. The problem is the filelog scan does
not start at FILE filenode in parent revision but at the last filelog revision.
So:
- Files not in the parent revision can be followed, the starting node is
unexpected
- Files in the parent revision can be followed from an incorrect starting
node.
This patch makes log --follow FILE fail if FILE is not in parent revision, and
computes ancestors of the parent revision FILE filenode.
The filtering logic of match objects cannot be reproduced with the existing
revsets as it operates at changeset files level. A changeset touching "a" and
"b" is matched by "-I a -X b" but not by "file(a) and not file(b)".
To solve this, a new internal "_matchfiles(...)" revset is introduced. It works
like "file(x)" but accepts more than one argument and its arguments are
prefixed with "p:", "i:" and "x:" to be used as patterns, include patterns or
exclude patterns respectively.
The _matchfiles revset is kept private for now:
- There are probably smarter ways to pass the arguments in a user-friendly way
- A "rev:" argument is likely appear at some point to emulate log command
behaviour with regard to filesets: they are evaluated for the parent revision
and applied everywhere instead of being reevaluated for each revision.
Before, a tempfile was used to create a temp file was created with 600
permissions and the uploaded data was written into it. This file was
then *copied* to .hg/largefiles/<hash>.
We now simply use atomictempfile to write the data to a temp file with
the right permissions and then rename that into place.
This replaces another use of tempfile with atomictempfile. The problem
with tempfile is that it creates files with 600 permissions instead of
respecting repo.store.createmode.
Before, the mode was copied from the file in the working copy. This is
inconsistent with how Mercurial normally creates files inside the .hg
folder. This can lead to a situation where you can read the files
under .hg/store but not under .hg/largefiles and so a clone will fail
if it needs to access a largefile.
when pattern which does not match against any files in working context
is specified, current implementation of 'localrepository.status()'
decides whether warning message about it should be shown or not by
'f not in context'
this works correctly for 'file pattern', but not for 'directory
pattern', because 'f not in context' always returns True for
directories, even if they are related to the context.
this patch uses 'changectx.dirs()' to examine whether specified
pattern is related to the context as a directory or not.
this patch adds 'dirs()' to changectx/workingctx, which returns map of
all directories deduced from manifest, to examine whether specified
pattern is related to the context as directory or not quickly.
'workingctx.dirs()' uses 'dirstate.dirs()' rather than building
another copy of it.
This will let use override the "join" value (and/or) depending on the option
considered. The option revset arity is now deduced from the revset and the
option value type, to simplify opt2revset definition.
Using "hg log -G --print-revset" prints the revset generated by graphlog and
exits. This helps debugging and writing shorter tests.
It has been suggested to handle these tests with doctests. I think the
extension approach is better because:
- It tests the actual parameter set passed to graphlog.revset(), not what we
expect it to be. 'branch' and 'only-branch' are currently distinct options
but nothing prevents fancyopts to grow a notion of option aliasing one day,
where both options would be merged before reaching the command.
- It can be used as debug output interleaved with real log calls.
v2:
- Use a test extension instead of a global deprecated new option
The previous code was assuming a default context of 3 lines. When fuzzing, it
would take this value in account to reduce the amount of removed line from
hunks top or bottom. For instance, if a hunk has only 2 lines of bottom
context, fuzzing with fuzz=1 would do nothing and with fuzz=2 it would remove
one of those lines. A hunk with one line of bottom context could not be fuzzed
at all. patch(1) has apparently no such restrictions and takes the fuzz level
at face value.
- test-import.t: fuzz/offset changes at the beginning of file are explained by
the new fuzzing behaviour and match patch(1) ones. Patching locations are
different but those of my patch(1) do not make a lot of sense right now
(patched output are the same)
- test-import-bypass.t: more agressive fuzzing makes a patching supposed to
fail because of context, succeed. Change the diff to avoid this.
- test-mq-merge.t: more agressive fuzzing would allow the merged patch to apply
with fuzz, but fortunately we disallow this behaviour. The new output is
kept.
I have not enough experience with patch(1) fuzzing to know whether aligning our
implementation on it is a good or bad idea. Until now, it has been the
implementation reference. For instance, "qpush" tolerates fuzz (test-mq-merge.t
runs the special case of pushing merge revisions where fuzzing is forbidden).
When applying hunks such as:
@@ -2,1 +2,2 @@
context
+change
fuzzing would empty the "old" block and make patchfile.apply() traceback.
Instead, we apply the new block at specified location without testing.
The "bottom hunk" test was removed as patch(1) has no problem applying hunk
with no context in the middle of a file.
- It moves hunks processing weirdness where it belongs
- It helps reusing said weirdness whenever fuzzit() is called, like during the
actual hunk fuzzing.
Splicemap lines are documented in hg help convert like:
key parent1, parent2
but parsed like:
key, parents = line.strip().rsplit(' ', 1)
parents = parents.replace(',', ' ').split()
The rsplit() call was introduced to handle spaces in keys for the generic
mapfile format. Spaces can appear in svn identifiers since they contain path
components. This logic makes less sense with splicemap since svn identifiers
can also appear on the right side, even if it is a bit less likely. Given the
parsing is theorically broken, I would rather follow what is documented already
and is correct in the main case where all identifiers are hg hashes. Also,
using svn identifiers in a splicemap sounds difficult as they are not easily
accessible.
Files are being replaced by rollback but the corresponding data in localrepo
isn't actually updated for things like bookmarks, phases, etc. Then when
rollback is done, the cache is updated thinking it has the most up-to-date
data, where in fact it is still pre-rollback.
We clear _filecache to force everything to be recreated.
When assigning a new object to filecached properties, the cached object that
was kept in the _filecache map was still holding the old object.
By implementing __set__, we track these changes too and update the cached
copy as well.
The dirstate is invalidated separately outside of invalidate() which is
already being called (other callers of invalidate() seems to suggest the
separation is there for a reason).
There is no reason to discard copy sources from the set of files considered by
addremove(). It was done to handle the case where a first patch would create
'a' and a second one would move 'a' to 'b'. If these patches were applied with
--no-commit, 'a' would first be marked as added, then unlinked and dropped from
the dirstate but still passed to addremove(). A better fix is thus to exclude
removed files which ends being dropped from the dirstate instead of removed.
Reported by Jason Harris <jason@jasonfharris.com>
current 'lfiles_repo.status()' implementation examines whether
specified patterns are related to largefiles in working directory (not
to STANDIN) or not by NOT-EMPTY-NESS of below list:
[f for f in match.files() if f in lfdirstate]
but it can not be assumed that all in 'match.files()' are file itself
exactly, because user may only specify part of path to match whole
under subdirectories recursively.
above examination will mis-recognize such pattern as 'not related to
largefiles', and executes normal 'status()' procedure. so, 'hg status'
shows '?'(unknown) status for largefiles in working directory unexpectedly.
this patch examines relation of pattern to largefiles by applying
'match()' on each entries in lfdirstate and checking wheter there is
no matched entry.
it may increase cost of examination, because it causes of full scan of
entries in lfdirstate.
so this patch uses normal for-loop instead of list comprehensions, to
decrease cost when matching is found.
When sorting revisions before converting them, we have to edit the revision
graph using splicemap entries. Otherwise, a spliced revision may be converted
before its synthetic parents. Invalid splicemap revisions are now detected
before starting the conversion.