In upcoming patches we'll need to be aware of all parents at the same time.
This also exposes a potential bug: if a line can be annotated with both parents
of a merge commit, it'll always be annotated with p2, not p1. I'm not sure if
that's what we want, but at least the code makes it clear now.
Add a new config field named default-date under the devel section to force all
implicits date to a specific value. If a explicit date is passed, it will
override the default.
This patch only affect changesets. Other usages (blackbox, obsmarkers) are
updated in later patchs.
The test runner is setting a bunch of alias to force the '--date' argument. We
will replace theses aliases in a later patch.
match() will soon gain more logic and we don't want to duplicate that
in icasefsmatch(), so merge the two functions instead and use a flag
to get case-insensitive behavior.
The matcher class is getting hard to understand. It will be easier to
follow if we can break it up into simpler matchers that we then
compose. I'm hoping to have one matcher that accepts regular
(non-include) patterns, one for exact file matches, one that always
matches (and maybe one that never does) and then compose them by
intersection and difference.
This patch takes a simple but important step towards that goal by
making match.match() a function (and renaming the matcher class itself
from "match" to "matcher"). The new function will eventually be
responsible for creating the simple matchers and composing them.
icasefsmatcher similarly gets a factory function (called
"icasefsmatch"). I also moved the other factory functions nearby.
The end goal is to make it possible to avoid potential expensive fctx.data()
when unnecessary.
While memctx is useful for creating new file contexts, there are many cases
where we could reuse an existing raw file revision (amend, histedit, rebase,
process a revision constructed by a remote peer, etc). The overlayfilectx
class is made to support such reuse cases. Together with a later patch, hash
calculation and expensive flag processor could be avoided.
The new method returns the low-level revlog flag. We already have "rawdata"
so a "rawflags" makes sense.
Both "rawflags" and "rawdata" will be used in a later patch.
Make basefilectx._flags a property cache, so basefilectx.flags() could be
reasonably reused. This avoids code duplication between memfilectx and a
class added in a later patch.
We set parent._descendantrev = child.rev() when walking parents in
blockancestors() so that, when linkrev adjustment is perform for these, it
starts from a close descendant instead of possibly topmost introrev. (See
`self._adjustlinkrev(self._descendantrev)` in filectx._changeid().)
This is similar to changeset 8758896efb1c, which added a "f._changeid"
instruction in annotate() for the same purpose.
However, here, we set _descendantrev explicitly instead of relying on the
'_changeid' cached property being accessed (with effect to set _changeid
attribute) so that, in _parentfilectx() (called from parents()), we go through
`if '_changeid' in vars(self) [...]` branch in which instruction
`fctx._descendantrev = self.rev()` finally appears and does what we want.
With this, we can roughly get a 3x speedup (including in example of issue5538
from mozilla-central repository) on usage of followlines revset (and
equivalent hgweb request).
Previously, calling blockancestors() with a fctx not touching file would
sometimes yield this filectx first, instead of the first "block ancestor",
because when compared to its parent it may have changes in specified line
range despite not touching the file at all.
Fixing this by starting the algorithm from the "base" filectx obtained using
fctx.introrev() (as done in annotate()).
In tests, add a changeset not touching file we want to follow lines of to
cover this case. Do this in test-annotate.t for followlines revset tests and
in test-hgweb-filelog.t for /log/<rev>/<file>?linerange=<from>:<to> tests.
If initial 'fctx' has changes in line range with respect to its parents, we
yield it first. This makes 'followlines(..., descend=True)' consistent with
'descendants()' revset which yields the starting revision.
We reuse one iteration of blockancestors() which does exactly what we want.
In test-annotate.t, adjust 'startrev' in one case to cover the situation where
the starting revision does not touch specified line range.
If this assertion fails, this indicates a flaw in the algorithm. So fail fast
instead of possibly producing wrong results.
Also extend the target line range in test to catch a merge changeset with all
its parents.
In the initial implementation of blockdescendants (and thus followlines(...,
descend=True) revset), only the first branch encountered in descending
direction was followed.
Update the algorithm so that all children of a revision ('x' in code) are
considered. Accordingly, we need to prevent a child revision to be yielded
multiple times when it gets visited through different path, so we skip 'i'
when this occurs. Finally, since we now consider all parents of a possible
child touching a given line range, we take care of yielding the child if it
has a diff in specified line range with at least one of its parent (same logic
as blockancestors()).
This is symmetrical with blockancestors() and yields descendants of a filectx
with changes in the given line range. The noticeable difference is that the
algorithm does not follow renames (probably because filelog.descendants() does
not), so we are missing branches with renames.
Previously the sanity check will construct manifestctx for both p1 and p2.
But it only needs the "manifest node" information, which could be read from
changelog directly.
The two function takes the very same arguments. We make this clearer and less
error prone by dispatching on the function only and having a single call point
in the code.
Changeset c832083e5671 removed the mutable default value, but did not explicitly
tested for None. Such implicit testing can introduce semantic and performance
issue. We move to an explicit testing for None as recommended by PEP8:
https://www.python.org/dev/peps/pep-0008/#programming-recommendations
I can't figure out what this branch is even trying to accomplish, and
it was introduced in 387a3aa50d61 which doesn't really shed any
insight into why longs are treated differently from ints.
This removes the uses of manifest.matches in context.py in favor of the new
manifest.diff(match) api. This is part of removing manifest.matches since it is
O(manifest).
To drop the dependency on ctx._manifestmatches(s) we transfer responsibilty for
creating status oriented manifests over to ctx._buildstatusmanifest(s). This
already existed for workingctx, we just need to implement a simple version for
basectx. The old _manifestmatches functionality is basically identical to the
_buildstatusmanifest functionality (minus the matching part), so no behavior
should be lost.
Previously we called self.manifest() in some cases to preload the
first manifest. This relied on the assumption that the later
_manifestmatches() call did not duplicate any work the original
self.manifest() call did. Let's remove that assumption, since it bit
me during my refactors of this area and is easy to remove.
committablectx had a _manifest implementation that was only used by the derived
workingctx class. The other derived versions, like memctx and metadataonlyctx,
define their own _manifest functions.
Let's move the function down to workingctx, and let's break it into two parts,
the _manifest part that reads from self._status, and the part that actually
builds the new manifest. This separation will let us reuse the builder code in a
future patch to answer _buildstatus with varying status inputs, since workingctx
has special behavior for _buildstatus that the other ctx's don't have.
There are several different node markers that indicate different working copy
states. The context._buildstatus function was only handling one of them, and
this patch makes it handle all of them (falling back to file content comparisons
when in one of these states).
This affects a future patch where we get rid of context._manifestmatches as part
of getting rid of manifest.matches(). context._manifestmatches is currently
hacky in that it uses the newnodeid for all added and modified files, which is
why the current newnodeid check is sufficient. When we get rid of this function
and use the normal manifest.diff function, we start to see the other indicators
in the nodes, so they need to be handled or else the tests fail.
This patch introduces a new 'raw' argument (defaults to False) to revlog's
revision() and _addrevision() methods.
When the 'raw' argument is set to True, it indicates the revision data should be
handled as raw data by the flagprocessor.
Note: Given revlog.addgroup() calls are restricted to changegroup generation, we
can always set raw to True when calling revlog._addrevision() from
revlog.addgroup().
This yields ancestors of `fctx` by only keeping changesets touching the file
within specified linerange = (fromline, toline).
Matching revisions are found by inspecting the result of `mdiff.allblocks()`,
filtered by `mdiff.blocksinrange()`, to find out if there are blocks of type
"!" within specified line range.
If, at some iteration, an ancestor with an empty line range is encountered,
the algorithm stops as it means that the considered block of lines actually
has been introduced in the revision of this iteration. Otherwise, we finally
yield the initial revision of the file as the block originates from it.
When a merge changeset is encountered during ancestors lookup, we consider
there's a diff in the current line range as long as there is a diff between
the merge changeset and at least one of its parents (in the current line
range).
When we have a lot of files writing a new manifest revision can be expensive.
This commit adds a possibility for memctx to reuse a manifest from a different
commit. This can be beneficial for commands that are creating metadata changes
without any actual files changed like "hg metaedit" in evolve extension.
I will send the change for evolve that leverages this once this is accepted.
Previously the added/modified placeholder hash for manifests generated from the
dirstate was a 21byte long string consisting of the p1 file hash plus a single
character to indicate an add or a modify. Normal hashes are only 20 bytes long.
This makes it complicated to implement more efficient manifest implementations
which rely on the hashes being fixed length.
Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being these 20 bytes (just like we rely on no
hash every being the nullid).
This changes the possible behavior slightly in that the hash for all
added/modified entries in the dirstate manifest will now be the same (so simple
node comparisons would say they are equal), but we should never be doing simple
node comparisons on these nodes even with the old hashes, because they did not
accurately represent the content (i.e. two files based off the same p1 file
node, with different working copy contents would have the same hash (even with
the appended character) in the old scheme too, so we couldn't depend on the
hashes period).
Previously the new-node placeholder hash for manifests generated from the
dirstate was a 21byte long string of "!" characters. Normal hashes are only 20
bytes long. This makes it complicated to implement more efficient manifest
implementations which rely on the hashes being fixed length.
Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being 20 "!" bytes in a row (just like we rely
on no hash ever being the nullid).
A future diff will do this for added and modified dirstate markers as well, so
we're putting the new newnodeid in node.py so there's a common place for these
placeholders.
This allows us to access the manifestctx for a given commit. This will be used
in a later patch to be able to copy the manifestctx when we want to make a new
commit.
As part of removing dependencies on manifest, this drops the find function and
fixes up the two existing callers to use the equivalent apis on manifestctx.
Since adjustlinkrev has "self", and is a method of a filectx object, it does
not need path, filelog, filenode. They can be fetched from the "self"
easily.