Commit Graph

40 Commits

Author SHA1 Message Date
Jun Wu
41049ab36d amend: use scmutil.cleanupnodes (BC)
This is marked as BC because the strip backup file name has changed.
2017-06-26 15:28:28 -07:00
Augie Fackler
aa7e7bad35 tests: clean up even more direct python calls with $PYTHON
This time ones that are prefixed with =, ", ', or `. This appears to
be the last of them.

Differential Revision: https://phab.mercurial-scm.org/D14
2017-06-20 17:31:18 -04:00
Durham Goode
2971a9e1cb treemanifest: make node reuse match flat manifest behavior
In a flat manifest, a node with the same content but different parents is still
considered a new node. In the current tree manifests however, if the content is
the same, we ignore the parents entirely and just reuse the existing node.

In our external treemanifest extension, we want to allow having one treemanifest
for every flat manifest, as a way of easing the migration to treemanifests. To
make this possible, let's change the root node treemanifest behavior to match
the behavior for flat manifests, so we can have a 1:1 relationship.

While this sounds like a BC breakage, it's not actually a state users can
normally get in because: A) you can't make empty commits, and B) even if you try
to make an empty commit (by making a commit then amending its changes away),
the higher level commit logic in localrepo.commitctx() forces the commit to use
the original p1 manifest node if no files were changed. So this would only
affect extensions and automation that reached past the normal
localrepo.commit() logic straight into the manifest logic.
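
A minimal sketch of why parents matter for node identity, assuming the usual revlog scheme where a node id is the SHA-1 of the two sorted parent ids followed by the content. The helper name node_id and the sample data are hypothetical, not taken from this patch.

```python
import hashlib

NULLID = b"\0" * 20  # id used for a missing parent


def node_id(text, p1=NULLID, p2=NULLID):
    """Hash the sorted parent ids followed by the content."""
    lo, hi = sorted((p1, p2))
    h = hashlib.sha1(lo + hi)
    h.update(text)
    return h.digest()


# Made-up manifest content and parent ids, for illustration only.
content = b"a.txt\0" + b"1" * 40 + b"\n"
parent_a = hashlib.sha1(b"A").digest()
parent_b = hashlib.sha1(b"B").digest()

# Same content, different parents: the two ids differ, so each is a new node.
differs = node_id(content, parent_a) != node_id(content, parent_b)
```

With this scheme, identical content only yields the same node when the parents are identical too, which is the flat manifest behavior the message describes.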
2017-03-01 16:19:41 -08:00
Durham Goode
f20d9939b3 treemanifest: add tests covering hg diff of partial trees
Previously the hg files tests also covered the logic (i.e.
treemanifest.matches) that governed how hg diff limited its diff. In a future
patch we will be switching treemanifest.diff() to have a custom implementation,
so let's go ahead and add equivalent test coverage for hg diff.
2017-03-07 18:29:58 -08:00
Jun Wu
3ee7ba0bd8 tests: replace "cp -r" with "cp -R"
The POSIX documentation about "cp" [1] says:

  ....

  RATIONALE
    ....
    Earlier versions of this standard included support for the -r option to
    copy file hierarchies. The -r option is historical practice on BSD and
    BSD-derived systems. This option is no longer specified by POSIX.1-2008
    but may be present in some implementations. The -R option was added as a
    close synonym to the -r option, selected for consistency with all other
    options in this volume of POSIX.1-2008 that do recursive directory
    descent.

    The difference between -R and the removed -r option is in the treatment
    by cp of file types other than regular and directory. It was
    implementation-defined how the -r option treated special files to allow
    both historical implementations and those that chose to support -r with
    the same abilities as -R defined by this volume of POSIX.1-2008. The
    original -r flag, for historic reasons, did not handle special files any
    differently from regular files, but always read the file and copied its
    contents. This had obvious problems in the presence of special file
    types; for example, character devices, FIFOs, and sockets.
    ....

  ....

  Issue 6
    The -r option is marked obsolescent.
    ....

  Issue 7
    ....
    The obsolescent -r option is removed.
    ....

  (No "Issue 8" yet)

Therefore it's clear that "cp -R" is strictly better than "cp -r".

The issue was discovered when running tests on OS X after 2e4d149e62aa.

[1]: pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html
2016-11-30 19:25:18 +00:00
Augie Fackler
ba4d11b62e bundlerepo: add support for treemanifests in cg3 bundles
This is a little messier than I'd like, and I'll probably come back
and do some more refactoring later, but as it is this unblocks
narrowhg. An alternative approach (which I may do as part of the
mentioned refactoring) would be to construct *all* dirlog instances up
front, so that we don't have to keep track of the linkmapper
method. This would avoid a reference cycle between the bundlemanifest
and the bundlerepository, but I was hesitant to do all the work up
front like that.

With this change, it's possible to do 'hg incoming' and 'hg pull' from
bundles in .hg/strip-backup in a treemanifest repository. Sadly, this
doesn't make it possible to 'hg clone' one of those (if you do 'hg
strip 0'), because the cg3 in the bundle gets written without a
treemanifest flag. Since that's going to be an involved refactor in a
different part of the code (which I *suspect* won't touch any of the
code I've just written here), let's leave it as an idea for Later.
2016-08-05 13:08:11 -04:00
Augie Fackler
a445a45d1d test-treemanifest: ensure manifest command isn't broken
I realized we weren't testing this while hunting a broken manifest
command bug that ended up being narrowhg's fault.
2016-07-28 16:27:35 -04:00
Martin von Zweigbergk
82a5e7d944 treemanifests: actually strip directory manifests
Stripping has only partly worked since f41815302d49 (repair: use cg3
for treemanifests, 2016-01-19): the bundle seems to have been created
correctly, but revlog entries in subdirectory revlogs were not
stripped. This meant that e.g. "hg verify" would fail after stripping
in a tree manifest repo.

To find the revisions to strip, we simply iterate over all directories
in the repo (included in store.datafiles()). This is inefficient for
stripping few commits, but efficient for stripping many commits. To
optimize for stripping few commits, we could instead walk the tree
from the root and find modified subdirectories, just like we do in the
changegroup code. I'm leaving that for another day.
2016-06-30 13:06:19 -07:00
Martin von Zweigbergk
6612ed3d4a changegroup: don't send empty subdirectory manifest groups
When grafting/rebasing, it is common for multiple changesets to make
the same change to a subdirectory. When writing the revlog for the
directory, the revlog code already takes care of not writing the entry
again. In 3eb9fa4180d3 (changegroup: prune subdirectory dirlogs too,
2016-02-12), I added the corresponding code in changegroup (not
sending entries the client already has), but I forgot to avoid sending
the entire changegroup if no nodes remained in the pruned
set. Although that's harmless besides the wasted network traffic, the
receiving side was checking for it (copied from the changegroup code
for handling files). This resulted in the client crashing with:

  abort: received dir revlog group is empty

Fix by simply not emitting a changegroup for the directory if there
were no changes in it. This matches how files are handled.
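
A rough sketch of the fix described above, not the actual changegroup code. The function name dir_groups and the dict/set inputs are hypothetical stand-ins for the real data structures.

```python
def dir_groups(dir_nodes, known):
    """Yield (directory, nodes) only for directories with nodes left to send.

    dir_nodes maps a directory name to its candidate nodes; known is the
    set of nodes the receiving side already has.
    """
    for directory, nodes in sorted(dir_nodes.items()):
        missing = [n for n in nodes if n not in known]
        if not missing:
            # Nothing survived pruning: emit no group for this directory.
            continue
        yield directory, missing


groups = list(dir_groups({"a/": ["n1"], "b/": ["n2", "n3"]}, {"n1", "n2"}))
```

Here "a/" is skipped entirely because its only node is already known, so the receiver never sees an empty group.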
2016-06-16 15:15:33 -07:00
Martin von Zweigbergk
3215d75682 bundle: avoid crash when no good changegroup version found
When using treemanifests, only changegroup3 bundles can be
created. However, there is currently no way of requesting a
changegroup3 bundle, so we run into an assertion in
changegroup.getbundler() when trying to get a changegroup2
bundler. Let's avoid the traceback and print a short error message
instead.
2016-03-25 23:05:32 -07:00
Danek Duvall
52c3c76c32 tests: Solaris cp doesn't support the -T option
The treemanifest tests use the -T option to cp in order to ensure that the
two directories named on the commandline are treated as peers, rather than
the usual behavior when the final argument is a directory.  GNU cp has this
option, but other implementations may not.  Thankfully, there's no pressing
reason to use it.  We can simply copy the contents of the first directory
into the target directory, since we know that the target directory already
exists.
2016-03-02 14:50:37 -08:00
Martin von Zweigbergk
58c3ff9aaf changegroup: fix treemanifests on merges
The current code for generating treemanifest revisions takes the list
of files in the changeset and finds the directories from them. This
does not work for merges, since a merge may pick file A from one side
and file B from another and neither of them would appear in the
changeset's "files" list, but the manifest would still change.

Fix this by instead walking the root manifest log for all needed
revisions, storing all needed file and subdirectory revisions, then
recursively visiting the subdirectories. This also turns out to be
faster: cloning a version of hg core converted to treemanifests went
from ~28s to ~19s (timing somewhat unfair: before this patch, timed
until crash; after this patch, timed until manifests complete).

The new algorithm is used only on treemanifest repos. Although it
works equally well on flat manifests, we leave the iteration over
files in the changeset for flat manifests for now.
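
The walk described above can be sketched roughly as follows. This is an illustration on toy data, not the changegroup implementation: the function collect and the trees mapping are hypothetical.

```python
def collect(trees, root_nodes):
    """Gather needed directory and file nodes by walking down from the root.

    trees maps (directory, node) -> {name: (child_node, is_dir)}.
    """
    needed_dirs = {"": set(root_nodes)}
    needed_files = {}
    pending = [""]
    while pending:
        directory = pending.pop()
        for node in sorted(needed_dirs[directory]):
            for name, (child, is_dir) in trees[(directory, node)].items():
                path = directory + name
                if is_dir:
                    sub = path + "/"
                    if sub not in needed_dirs:
                        needed_dirs[sub] = set()
                        pending.append(sub)
                    needed_dirs[sub].add(child)
                else:
                    needed_files.setdefault(path, set()).add(child)
    return needed_dirs, needed_files


# Two root revisions that differ only below "a/", as could happen in a
# merge whose changeset lists no files.
trees = {
    ("", "r1"): {"a": ("d1", True), "top.txt": ("f1", False)},
    ("", "r2"): {"a": ("d2", True), "top.txt": ("f1", False)},
    ("a/", "d1"): {"x.txt": ("f2", False)},
    ("a/", "d2"): {"x.txt": ("f3", False)},
}
dirs, files = collect(trees, ["r1", "r2"])
```

Because the walk starts from the manifests rather than from the changeset's "files" list, the change under "a/" is found even when that list is empty.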
2016-02-12 23:09:09 -08:00
Tony Tung
0ed99a4989 treemanifest: use "cp xyz/." instead of "cp xyz/*"
This is more similar to cp -T because it covers hidden files.
2016-02-23 17:22:51 -08:00
Martin von Zweigbergk
45e493c761 verify: check for orphaned dirlogs
We already report orphaned filelogs, i.e. revlogs for files that are
not mentioned in any manifest. This change adds checking for orphaned
dirlogs, i.e. revlogs that are not mentioned in any parent-directory
dirlog.

Note that, for fncachestore, only files mentioned in the fncache are
considered; there's no check for files in .hg/store/meta that are not
mentioned in the fncache. This is no different from the current
situation for filelogs.
2016-02-03 15:35:15 -08:00
Martin von Zweigbergk
b2b4f9e694 verify: check directory manifests
In repos with treemanifests, there is no specific verification of
directory manifest revlogs. It simply collects all file nodes by
reading each manifest delta. With treemanifests, that means calling
the manifest._slowreaddelta(). If there are missing revlog entries in
a subdirectory revlog, 'hg verify' will simply report the exception
that occurred while trying to read the root manifest:


  manifest@0: reading delta 1700e2e92882: meta/b/00manifest.i@67688a370455: no node

This patch changes the verify code to load only the root manifest at
first and verify all revisions of it, then verify all revisions of
each direct subdirectory, and so on, recursively. The above message
becomes

  b/@0: parent-directory manifest refers to unknown revision 67688a370455

Since the new algorithm reads a single revlog at a time and in order,
'hg verify' on a treemanifest version of the hg core repo goes from
~50s to ~14s. As expected, there is no significant difference on a
repo with flat manifests.
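
A simplified sketch of the level-by-level check, assuming a toy in-memory layout. The function find_dangling_dirs and the revlogs mapping are hypothetical, and the real verify code reports far more than this.

```python
from collections import deque


def find_dangling_dirs(revlogs):
    """Return (subdirectory, missing_node) pairs.

    revlogs maps directory -> {node: {name: (child_node, is_dir)}}.
    Directories are visited one at a time, starting from the root ("").
    """
    problems = []
    seen = {""}
    queue = deque([""])
    while queue:
        directory = queue.popleft()
        for node in sorted(revlogs.get(directory, {})):
            entries = revlogs[directory][node]
            for name in sorted(entries):
                child, is_dir = entries[name]
                if not is_dir:
                    continue
                sub = directory + name + "/"
                if child not in revlogs.get(sub, {}):
                    # The parent directory refers to a revision that the
                    # subdirectory's own revlog does not contain.
                    problems.append((sub, child))
                if sub not in seen:
                    seen.add(sub)
                    queue.append(sub)
    return problems


broken = {"": {"r0": {"b": ("67688a370455", True)}}, "b/": {}}
problems = find_dangling_dirs(broken)
```

Each directory's revisions are checked together before its subdirectories are visited, which mirrors the "one revlog at a time" order the message credits for the speedup.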
2016-02-07 21:13:24 -08:00
timeless
354922c6cb tests: put test-treemanifest.t on a port diet
test-treemanifest.t had introduced HGPORT3 and HGPORT4,
which were improperly added to run-tests.py.

It also was not using HGPORT1.
This recycles HGPORT, and shifts everything into HGPORT1 + HGPORT2.
2016-02-17 19:34:01 +00:00
Martijn Pieters
241d8f86a5 treemanifest: don't use cp -T, not supported on OS X
The OS X cp implementation has no -T switch. Copy directory contents using a
glob instead.
2016-02-11 13:50:38 +00:00
Martin von Zweigbergk
d7ef7dd40a treemanifest: fix debugrebuildfncache
When I taught debugrebuildfncache about dirlogs in ebe9dacc63ba
(treemanifests: fix streaming clone, 2016-02-04), I added a
last-minute "if 'treemanifest' in repo" guard. That should have been
checking for "... in repo.requirements". Fix that and add tests for
it.
2016-02-07 21:44:38 -08:00
Martin von Zweigbergk
e50c296659 treemanifests: fix streaming clone
Similar to the previous patch, the .hg/store/meta/ directory does not
get copied when using "hg clone --uncompressed". Fix by including
"meta/" in store.datafiles(). This seems safe to do, as there are only
a few users of this method. "hg manifest" already filters the paths by
"data/" prefix. The calls from largefiles also seem safe. The use in
verify needs updating to prevent it from mistaking dirlogs for
orphaned filelogs. That change is included in this patch.

Since the dirlogs will now be in the fncache when using fncachestore,
let's also update debugrebuildfncache(). That will also allow any
existing treemanifest repos to get their dirlogs into the fncache.

Also update test-treemanifest.t to use a directory name that
requires dot-encoding and uppercase-encoding so we test that the path
encoding works.
2016-02-04 08:34:07 -08:00
Martin von Zweigbergk
bb8190a058 treemanifests: fix local clone
When doing a local clone with treemanifests, the .hg/store/meta/
directory currently does not get copied. To fix it, all we need to do
is to add it to the list of directories to copy.
2016-02-02 17:31:17 -08:00
Martin von Zweigbergk
1a2d253a65 tests: simplify treemanifest test by backing up entire .hg/store
2016-02-03 15:35:23 -08:00
Martin von Zweigbergk
86ca76bafe changegroup: fix pulling to treemanifest repo from flat repo (issue5066)
In b89de5ee5b31 (changegroup: don't support versions 01 and 02 with
treemanifests, 2016-01-19), I stopped supporting use of cg1 and cg2
with treemanifest repos. What I had not considered was that it's
perfectly safe to pull *to* a treemanifest repo using any changegroup
version. As reported in issue5066, I therefore broke pull from old
repos into a treemanifest repo. It was not covered by the test case,
because that pulled from a local repo while enabling treemanifests,
which enabled treemanifests on the source repo as well. After
switching to pulling via HTTP, it breaks.

Fix by splitting up changegroup.supportedversions() into
supportedincomingversions() and supportedoutgoingversions().
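
A minimal sketch of the split, assuming a boolean in place of the real repo argument and ignoring any experimental config gating. The snake_case names below are hypothetical stand-ins for the two functions named above.

```python
ALL_VERSIONS = {"01", "02", "03"}


def supported_incoming_versions(uses_treemanifest):
    """Receiving is safe with any changegroup version."""
    return set(ALL_VERSIONS)


def supported_outgoing_versions(uses_treemanifest):
    """A treemanifest repo can only send cg3."""
    versions = set(ALL_VERSIONS)
    if uses_treemanifest:
        versions.discard("01")
        versions.discard("02")
    return versions


incoming = supported_incoming_versions(True)
outgoing = supported_outgoing_versions(True)
```

Keeping the two directions separate lets a treemanifest repo keep pulling from older flat repos while still refusing to emit cg1 or cg2.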
2016-01-27 09:07:28 -08:00
Martin von Zweigbergk
19285a7ebc tests: minor cleanup to treemanifest test
2016-01-28 13:49:05 -08:00
Augie Fackler
db82034373 changegroup: fix treemanifest exchange code (issue5061)
There were two mistakes: one was accidental reuse of the fclnode
variable from the loop gathering file nodes, and the other (masked by
that bug) was not correctly handling deleted directories. Both cases
are now fixed and the test passes.
2016-01-27 10:24:25 -05:00
Martin von Zweigbergk
d1531da666 exchange: set 'treemanifest' param on pushed changegroups too
In 7a1ccfe03f74 (treemanifests: set bundle2 part parameter indicating
treemanifest, 2016-01-08), I didn't realize I had to set the parameter
separately for getbundle and unbundle. Having the parameter there on
push allows us to push to an empty repo and have the requirements
updated correctly.
2016-01-22 16:31:50 -08:00
Martin von Zweigbergk
c28812c552 shelve: use cg3 for treemanifests
Similar to previous change, this teaches shelve to pick the right
changegroup version for repos that use treemanifests.
2016-01-19 15:37:07 -08:00
Martin von Zweigbergk
857d2206c3 repair: use cg3 for treemanifests
The newly created helper changegroup.safeversion() knows to pick
version 03 if the repo uses treemanifests, so just using that means we
pick the right changegroup version.
2016-01-19 15:38:24 -08:00
Martin von Zweigbergk
63c15f247e changegroup3: introduce experimental.changegroup3 boolean config
In order to give us the freedom to change the changegroup3 format,
let's hide it behind an experimental config. Since it is required by
treemanifests, that will override the cg3 config.
2016-01-12 21:23:45 -08:00
Augie Fackler
d33d6a0cb5 changegroup: introduce cg3, which has support for exchanging treemanifests
I'm not entirely happy with using a trailing / on a "file" entry for
transferring a treemanifest. We've discussed putting some flags on
each file header[0], but I'm unconvinced that's actually any better:
if we were going to add another feature to the cg format we'd still be
doing a version bump anyway to cg4, so I'm inclined to not spend time
coming up with a more sophisticated format until we actually know what
the next feature we want to stuff in a changegroup will be.

Test changes outside test-treemanifest.t are only due to the new CG3
bundlecap showing up in the wire protocol.

Many thanks to adgar@google.com and martinvonz@google.com for helping
me with various odd corners of the changegroup and treemanifest API.

0: It's not hard refactoring, nor is it a lot of work. I'm just
disinclined to do speculative work when it's not clear what the
customer would actually be.
2015-12-11 11:23:49 -05:00
Martin von Zweigbergk
8efd14d515 manifest: use 't' for tree manifest flag
We currently use 'd' to indicate that a manifest entry is a
directory. Let's switch to 't', since that's not a valid hex digit and
therefore easier to spot in the raw manifest data.

This will break any existing repos with tree manifests, but it's still
an experimental feature and there are probably only a few test repos
in existence with 'd' flags.
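
A small illustration of the "not a valid hex digit" point, assuming the raw manifest line layout of path, NUL byte, 40 hex digits, then optional flags. The helper split_entry and the sample line are hypothetical.

```python
import string


def split_entry(line):
    """Split a raw manifest line into (path, hex node, flags)."""
    path, rest = line.rstrip("\n").split("\0")
    return path, rest[:40], rest[40:]


# A made-up directory entry: 40 hex digits followed by the flag.
entry = split_entry("dir\0" + "ab" * 20 + "t\n")

# 'd' could be mistaken for one more hex digit of the node; 't' cannot.
d_is_hex = "d" in string.hexdigits
t_is_hex = "t" in string.hexdigits
```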
2015-12-04 14:24:45 -08:00
Martin von Zweigbergk
7a4ad651b5 revlog: don't consider nullrev when choosing delta base
In the most complex case, we try using the incoming delta base, then
we try both parents, and then we try the previous revlog entry. If
none of these result in a good delta, we naturally use the null
revision as base. However, we sometimes consider the nullrev before we
have exhausted our other options. Specifically, when both parents are
null, we use the nullrev as delta base if it produces a good delta
(according to _isgooddelta()), and we fail to try the previous revlog
entry as delta base. After e60126c6093d (addrevision: use general
delta when the incoming base delta is bad, 2015-12-01), it can also
happen for non-merge commits when the incoming delta is not good.
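
The intended ordering can be sketched as below. This is a simplification with hypothetical names, not the revlog code: is_good stands in for _isgooddelta(), and the candidate list flattens the "most complex case" described above.

```python
NULLREV = -1


def choose_delta_base(incoming_base, p1, p2, prev, is_good):
    """Try every real candidate before falling back to the null revision."""
    for rev in (incoming_base, p1, p2, prev):
        if rev is None or rev == NULLREV:
            continue  # never pick nullrev while other options remain
        if is_good(rev):
            return rev
    return NULLREV


# Both parents are null, but the previous revlog entry gives a good delta.
base = choose_delta_base(None, NULLREV, NULLREV, 5, lambda rev: rev == 5)
```

With null parents the previous entry is still tried, which is the case the message says was being skipped.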

The Firefox repo (from many months back) shrinks a tiny bit with this
patch: from 1.855GB to 1.830GB (1.4%). The hg repo itself shrinks even
less: by less than 0.1%. There may be repos that get larger instead.

This undoes the unexplained test change in e60126c6093d.
2015-12-04 17:46:56 -08:00
Pierre-Yves David
2d40bbaa65 test: enable generaldelta early in 'test-treemanifest.t'
Having generaldelta on results in minor test output changes (as we are staring
at the revlog).
2015-10-19 11:28:31 +02:00
Matt Harbison
75b85f98ed test-treemanifest: add globs for Windows
2015-06-01 22:46:05 -04:00
Martin von Zweigbergk
0bfa24333c treemanifest: visit directory 'foo' when given e.g. '-X foo/ba?'
For globs like 'foo/ba?', match._roots() will return 'foo'. Since
visitdir() excludes directories in the excluded roots, it would skip
the entire foo directory. This is incorrect, since 'foo/ba?' doesn't
mean that everything in foo/ should be excluded. Note that visitdir()
is called only from the treemanifest class, so this only affects tree
manifests. Fix by adding roots to the set of excluded roots only if
there are no excluded patterns.
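
A simplified sketch of the rule, assuming excludes arrive as (kind, pattern) pairs. The functions excluded_roots and visitdir below are hypothetical reductions of the matcher logic, which handles more cases than this.

```python
def excluded_roots(excludes):
    """Return directories that can be skipped outright.

    Roots are only trusted when every exclude is a plain 'path'; any real
    pattern (such as a glob) means the root may be only partly excluded.
    """
    if any(kind != "path" for kind, _ in excludes):
        return set()
    return {pattern.rstrip("/") for _, pattern in excludes}


def visitdir(directory, excludes):
    """Decide whether a tree traversal should descend into directory."""
    return directory not in excluded_roots(excludes)


glob_case = visitdir("foo", [("glob", "foo/ba?")])
path_case = visitdir("foo", [("path", "foo")])
```

The glob case still visits foo, since only some of its entries match, while the plain path case can skip the directory without reading it.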

Since 'glob' is the default pattern type for globs, we also need to
update some -X patterns in the tests to be of 'path' type to take
advantage of the visitdir tricks. For consistency, also update the -I
patterns.

It seems a little unfortunate that 'foo' in 'hg files -X foo' is
considered a pattern because of the implied 'glob' type, but improving
that is left for another day.
2015-05-27 10:44:04 -07:00
Matt Harbison
5d5e89d36a test-treemanifest: add globs for Windows
2015-05-27 12:14:10 -04:00
Drew Gottlieb
eb5e31d8eb match: have visitdir() consider includes and excludes
match.visitdir() used to only look at the match's primary pattern roots to
decide if a treemanifest traverser should descend into a particular directory.
This change logically makes visitdir also consider the match's include and
exclude pattern roots (if applicable) to make this decision.

This is especially important for situations like using narrowhg with multiple
treemanifest revlogs.
2015-05-18 14:29:20 -07:00
Martin von Zweigbergk
e9f7136157 treemanifest: lazily load manifests
Most operations on treemanifests already visit only relevant
submanifests. Notable examples include __getitem__, __contains__,
walk/matches with matcher, diff. By making submanifests lazily loaded,
we speed up all these operations.

The lazy loading is achieved by adding a _load() method that gets
defined where we currently eagerly parse the manifest. We make sure to
call it before any access to _dirs, _files or _flags.
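
The lazy-loading pattern can be sketched like this. The class LazyTree and its loader callback are hypothetical. Only the idea of calling _load() before touching the parsed state comes from the message.

```python
class LazyTree:
    """Parse the manifest only when its contents are first needed."""

    def __init__(self, loader):
        self._loader = loader  # callable returning {filename: node}
        self._loaded = False
        self._files = {}

    def _load(self):
        if not self._loaded:
            self._files = self._loader()
            self._loaded = True

    def __contains__(self, name):
        self._load()
        return name in self._files

    def __getitem__(self, name):
        self._load()
        return self._files[name]


calls = []


def loader():
    calls.append(1)  # record each (expensive) parse
    return {"README.txt": "abc123"}


tree = LazyTree(loader)
untouched = len(calls)  # constructing the tree parses nothing
found = "README.txt" in tree
node = tree["README.txt"]
```

Submanifests that an operation never touches are never parsed, which is where the savings for narrow operations would come from.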

Some timings on the Mozilla repo (with flat manifest timings for
reference):

hg cat -r . README.txt: 1.644s -> 0.096s (0.255s)
hg diff -r .^ -r .    : 1.746s -> 0.137s (0.431s)
hg files -r . python  : 1.508s -> 0.146s (0.335s)
hg files -r .         : 2.125s -> 2.203s (0.712s)
2015-04-09 17:14:35 -07:00
Matt Harbison
dd093fd805 test-treemanifest: add globs for Windows
2015-05-18 11:37:29 -04:00
Martin von Zweigbergk
decbcc4c31 treemanifest: add --dir option to debug{revlog,data,index}
It should be possible to debug the submanifest revlogs without having
to know where they are stored (in .hg/store/meta/), so let's add a
--dir option for this purpose.
2015-04-12 23:51:06 -07:00
Martin von Zweigbergk
1acf6c029c treemanifest: store submanifest revlog per directory
With this change, when tree manifests are enabled (in .hg/requires),
commits will be written with one manifest revlog per directory. The
manifest revlogs are stored in
.hg/store/meta/$dir/00manifest.[id].
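
A minimal sketch of the layout above, ignoring the filename encoding the store applies to directory names. The helper dirlog_paths is hypothetical.

```python
import posixpath


def dirlog_paths(directory):
    """Return the (.i, .d) revlog paths for a directory's manifest."""
    base = posixpath.join(".hg/store/meta", directory.strip("/"), "00manifest")
    return base + ".i", base + ".d"


index_path, data_path = dirlog_paths("a/b/")
```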

Flat manifests can still be read and interacted with as usual (they
are also read into treemanifest instances). The functionality for
writing treemanifest as a flat manifest to disk is still left in the
code; tests still pass with '_treeinmem=True' hardcoded.

Exchange is not yet implemented.
2015-04-13 23:21:02 -07:00