Commit Graph

112 Commits

Author SHA1 Message Date
Martin von Zweigbergk
f225c55b2a verify: replace _validpath() by matcher
The verifier calls out to _validpath() to check if it should verify
that path and the narrowhg extension overrides _validpath() to tell
the verifier to skip that path. In treemanifest repos, the verifier
calls the same method to check if it should visit a
directory. However, the decision to visit a directory is different
from the condition that it's a matching path, and narrowhg was working
around it by returning True from its _validpath() override if *either*
was true.

Similar to how one can do "hg files -I foo/bar/ -X foo/" (making the
include pointless), narrowhg can be configured to track the same
paths. In that case match("foo/bar/baz") would be false, but
match.visitdir("foo/bar/baz") turns out to be true, causing verify to
fail. This may seem like a bug in visitdir(), but it's explicitly
documented to be undefined for subdirectories of excluded
directories. When using treemanifests, the walk would not descend into
foo/, so verification would pass. However, when using flat manifests,
there is no recursive directory walk and the file path "foo/bar/baz"
would be passed to _validpath() without "foo/" (actually without the
slash) being passed first. As explained above, _validpath() would
return true for the file path and "hg verify" would fail.

Replacing the _validpath() method by a matcher seems like the obvious
fix. Narrowhg can then pass in its own matcher and not have to
conflate the two matching functions (for dirs and files). I think it
also makes the code clearer.
2017-01-23 10:48:55 -08:00
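The difference between matching a file and deciding whether to visit a directory, as described in the commit above, can be illustrated with a toy matcher. This is a minimal sketch and not Mercurial's match module: the `ToyMatcher` class and the way its `visitdir()` only checks excluded roots are assumptions made up for illustration.

```python
class ToyMatcher:
    """Hypothetical include/exclude matcher (not Mercurial's implementation)."""

    def __init__(self, includes, excludes):
        self.includes = [p.rstrip("/") for p in includes]
        self.excludes = [p.rstrip("/") for p in excludes]

    @staticmethod
    def _under(path, root):
        # True if path is root itself or lies somewhere below it.
        return path == root or path.startswith(root + "/")

    def __call__(self, path):
        """Exact answer: is this file path matched?"""
        included = any(self._under(path, i) for i in self.includes)
        excluded = any(self._under(path, e) for e in self.excludes)
        return included and not excluded

    def visitdir(self, directory):
        """Cheap hint for a top-down walk. It only recognizes the excluded
        roots themselves, so its answer below an excluded directory is
        not meaningful."""
        if directory in self.excludes:
            return False
        return any(self._under(directory, i) or self._under(i, directory)
                   for i in self.includes)


# Same shape as "hg files -I foo/bar/ -X foo/".
m = ToyMatcher(includes=["foo/bar/"], excludes=["foo/"])
file_matches = m("foo/bar/baz")        # exact file match
dir_hint = m.visitdir("foo/bar/baz")   # hint asked about a path below foo/
walk_stops = m.visitdir("foo")         # a recursive walk would stop here
```

In this sketch a recursive walk never asks about anything below `foo`, while a flat walk that treats either answer as "valid" would accept `foo/bar/baz` even though the file itself does not match.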
Augie Fackler
1a551b5433 verify: avoid shadowing two variables with a list comprehension
The variable names are clearly worse now, but since we're really just
transposing key and value I'm not too worried about the clarity loss.
2016-11-10 16:35:54 -05:00
Durham Goode
52b8095f37 manifest: remove last uses of repo.manifest
Now that all the functionality has been moved to manifestlog/manifestrevlog/etc,
we can finally change all the uses of repo.manifest to use the new versions. A
future diff will then delete repo.manifest.

One additional change in this commit is to change repo.manifestlog to be a
@storecache property instead of @property. This is required because some uses of
repo.manifest require that it be settable (contrib/perf.py and the static http
server). We can't do this in a prior change because we can't use @storecache on
this until repo.manifest is no longer used anywhere.
2016-11-10 02:13:19 -08:00
Durham Goode
d793e01462 manifest: remove manifest.readshallowdelta
This removes manifest.readshallowdelta and converts its one consumer to use
manifestlog instead.
2016-11-02 17:10:47 -07:00
Anton Shestakov
edb97f0e4a verify: specify unit for ui.progress when checking files 2016-03-11 20:18:41 +08:00
Martin von Zweigbergk
766e80bab5 verify: show progress while verifying dirlogs
In repos with treemanifests, the non-root-directory dirlogs often have
many more total revisions than the root manifest log has. This change
adds progress output to that part of 'hg verify'. Since the verification
is recursive along the directory tree, we don't know how many total
revisions there are at the beginning of the command, so instead we
report progress in units of directories, much like we report progress
for verification of files today.

I'm not very happy with passing both 'storefiles' and 'progress' into
the recursive calls. I tried passing in just a 'visitdir(dir)'
callback, but the results did not seem better overall. I'm happy to
update if anyone has better ideas.
2016-02-11 15:38:56 -08:00
Martin von Zweigbergk
45e493c761 verify: check for orphaned dirlogs
We already report orphaned filelogs, i.e. revlogs for files that are
not mentioned in any manifest. This change adds checking for orphaned
dirlogs, i.e. revlogs that are not mentioned in any parent-directory
dirlog.

Note that, for fncachestore, only files mentioned in the fncache are
considered; there's no check for files in .hg/store/meta that are not
mentioned in the fncache. This is no different from the current
situation for filelogs.
2016-02-03 15:35:15 -08:00
Martin von Zweigbergk
b2b4f9e694 verify: check directory manifests
In repos with treemanifests, there is no specific verification of
directory manifest revlogs. The verifier simply collects all file nodes
by reading each manifest delta. With treemanifests, that means calling
manifest._slowreaddelta(). If there are missing revlog entries in
a subdirectory revlog, 'hg verify' will simply report the exception
that occurred while trying to read the root manifest:


  manifest@0: reading delta 1700e2e92882: meta/b/00manifest.i@67688a370455: no node

This patch changes the verify code to load only the root manifest at
first and verify all revisions of it, then verify all revisions of
each direct subdirectory, and so on, recursively. The above message
becomes

  b/@0: parent-directory manifest refers to unknown revision 67688a370455

Since the new algorithm reads a single revlog at a time and in order,
'hg verify' on a treemanifest version of the hg core repo goes from
~50s to ~14s. As expected, there is no significant difference on a
repo with flat manifests.
2016-02-07 21:13:24 -08:00
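The directory-by-directory order described in the commit above can be sketched as follows. This is only an illustration under stated assumptions: the `verify_tree` function and the nested-dict `dirlogs` layout are hypothetical, and the error text borrows the wording from the commit message without the linkrev part.

```python
def verify_tree(dirlogs, directory="", wanted=None, errors=None):
    """Check one directory's revisions, then recurse into its subdirectories.

    dirlogs is a toy layout: {directory: {node: {subdir: subnode}}}.
    wanted is the set of nodes the parent directory refers to.
    """
    errors = [] if errors is None else errors
    revisions = dirlogs.get(directory, {})
    label = directory or "manifest"

    # Every node the parent refers to must exist in this directory's log.
    to_check = revisions if wanted is None else wanted
    for node in sorted(to_check):
        if node not in revisions:
            errors.append(
                "%s: parent-directory manifest refers to unknown revision %s"
                % (label, node))

    # Gather the subdirectory nodes referenced by this directory's revisions.
    children = {}
    for entries in revisions.values():
        for subdir, subnode in entries.items():
            children.setdefault(subdir, set()).add(subnode)

    # Only now move on to each direct subdirectory, one log at a time.
    for subdir in sorted(children):
        verify_tree(dirlogs, subdir, children[subdir], errors)
    return errors


toy = {"": {"r1": {"b/": "67688a370455"}}, "b/": {}}
problems = verify_tree(toy)
```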
Martin von Zweigbergk
73fc56cffe verify: extract "manifest" constant into variable
The "manifest" label that's used in error messages will instead be the
directory path for subdirectory manifests (not the root manifest), so
let's extract the constant to a variable already to make future
patches simpler.
2016-02-03 15:53:48 -08:00
Martin von Zweigbergk
8868dab63d verify: use similar language for missing manifest and file revisions
When a changeset refers to a manifest revision that's not found in the
manifest log, we say "changeset refers to missing revision X", but
when a manifest refers to file revision that's not found in the
filelog, we say "X in manifests not found". The language used for
missing manifest revisions seems clearer, so let's use that for
missing filelog revisions too.
2016-02-07 22:46:20 -08:00
Martin von Zweigbergk
0e98867ba2 verify: include "manifest" prefix in a few more places
We include the "manifest" prefix on most other errors, so it seems
consistent to add them to the remaining messages too. Also, having the
"manifest" prefix will be more consistent with having the directory
prefix there when we add support for treemanifests. With the
"manifest" at the beginning, let's remove the now-redundant
"manifest" in the message itself.
2016-02-02 10:42:28 -08:00
Martin von Zweigbergk
bb06be61b6 verify: drop unnecessary check for nullid
In 1f64dad11884 (verify: filter messages about missing null manifests
(issue2900), 2011-07-13), we started ignoring nullid in the list of
manifest nodeids to check. Then, in 954b09a82a61 (verify: do not choke
on valid changelog without manifest, 2012-08-21), we stopped adding
nullid to the list to start with. So let's drop the left-over check
now.
2016-02-02 09:46:14 -08:00
Martin von Zweigbergk
d138e53179 verify: move cross-checking of changeset/manifest out of _crosscheckfiles()
Reasons:

 * _crosscheckfiles(), as the name suggests, is about checking that
   the set of files mentioned in changesets match the set of
   files mentioned in the manifests.

 * The "checking" in _crosscheckfiles() looked rather strange, as it
   just emitted an error for *every* entry in mflinkrevs. The reason
   was that these were the entries remaining after the call to
   _verifymanifest(). Moving all the processing of mflinkrevs into
   _verifymanifest() makes it much clearer that it's the remaining
   entries that are a problem.

Functional change: progress is no longer reported for "crosschecking"
of missing manifest entries. Since the crosschecking phase takes a
tiny fraction of the verification, I don't think this is a
problem. Also, any reports of "changeset refers to unknown manifest"
will now come before "crosschecking files in changesets and
manifests".
2016-01-31 00:10:56 -08:00
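The second reason in the commit above, that only the entries left over after manifest verification are a problem, can be sketched like this. The function name, the shape of `mflinkrevs` (manifest node mapped to changeset revisions) and the exact message format are assumptions for illustration.

```python
def verify_manifest(mflinkrevs, manifest_nodes):
    """Hypothetical sketch: report manifest nodes that changesets refer to
    but that never show up in the manifest log."""
    errors = []
    for node in manifest_nodes:
        # A node found in the manifest log is accounted for; drop it.
        mflinkrevs.pop(node, None)
    # Whatever remains was referenced by a changeset but never seen.
    for node, revs in sorted(mflinkrevs.items()):
        for rev in revs:
            errors.append(
                "%d: changeset refers to unknown manifest %s" % (rev, node))
    return errors


referenced = {"aaa": [0], "bbb": [1, 2]}
report = verify_manifest(referenced, manifest_nodes=["aaa"])
```

Keeping both the removal and the reporting in one function makes it visible why every remaining entry is an error.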
Martin von Zweigbergk
e50c296659 treemanifests: fix streaming clone
Similar to the previous patch, the .hg/store/meta/ directory does not
get copied when using "hg clone --uncompressed". Fix by including
"meta/" in store.datafiles(). This seems safe to do, as there are only
a few users of this method. "hg manifest" already filters the paths by
"data/" prefix. The calls from largefiles also seem safe. The use in
verify needs updating to prevent it from mistaking dirlogs for
orphaned filelogs. That change is included in this patch.

Since the dirlogs will now be in the fncache when using fncachestore,
let's also update debugrebuildfncache(). That will also allow any
existing treemanifest repos to get their dirlogs into the fncache.

Also update test-treemanifest.t to use a directory name that
requires dot-encoding and uppercase-encoding so we test that the path
encoding works.
2016-02-04 08:34:07 -08:00
Martin von Zweigbergk
80df9c6505 verify: recover lost freeing of memory
In 0413f674179e (verify: move file cross checking to its own function,
2016-01-05), "mflinkrevs = None" was moved into that function, so the
reference was cleared there, but the calling function now held on to
the variable. The point of clearing it was presumably to free up
memory, so let's move the clearing to the calling function where it
makes a difference. Also change "mflinkrevs = None" to "del
mflinkrevs", since the comment about scope now surely is obsolete.
2016-01-31 00:31:55 -08:00
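Why clearing the name inside the called function no longer frees anything can be shown with a weak reference. This is a small sketch, assuming CPython's reference counting; the `Table` class and both function names are hypothetical.

```python
import weakref


class Table(dict):
    """dict subclass, because plain dicts cannot be weakly referenced."""


def crosscheck(mflinkrevs):
    count = len(mflinkrevs)
    mflinkrevs = None  # only clears the callee's own local name
    return count


def verify():
    mflinkrevs = Table({"node": [0]})
    probe = weakref.ref(mflinkrevs)
    crosscheck(mflinkrevs)
    alive_after_call = probe() is not None  # the caller still holds it
    del mflinkrevs                          # drop the last reference
    alive_after_del = probe() is not None
    return alive_after_call, alive_after_del


result = verify()
```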
Bryan O'Sullivan
1d0c7077f2 with: use context manager in verify 2016-01-15 13:14:49 -08:00
Martin von Zweigbergk
61149c5496 verify: replace "output parameters" by return values
_verifychangelog() and _verifymanifest() accept dictionaries that they
populate. We pass in empty dictionaries, so it's clearer to create
them in the functions and return them.
2016-01-05 21:25:51 -08:00
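The change from "output parameters" to return values in the commit above can be sketched as follows. Both function names and the sample data are hypothetical; only the pattern is taken from the commit message.

```python
def collect_into(changesets, mflinkrevs):
    """Before: the caller passes in an empty dict for the function to fill."""
    for rev, mfnode in changesets:
        mflinkrevs.setdefault(mfnode, []).append(rev)


def collect(changesets):
    """After: the function creates the dict itself and returns it."""
    mflinkrevs = {}
    for rev, mfnode in changesets:
        mflinkrevs.setdefault(mfnode, []).append(rev)
    return mflinkrevs


changesets = [(0, "aaa"), (1, "bbb"), (2, "aaa")]

old_style = {}
collect_into(changesets, old_style)

new_style = collect(changesets)
```

Both styles build the same mapping; the second makes the data flow visible at the call site.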
Durham Goode
20229ff6fc verify: get rid of some unnecessary local variables
Now that all the major functionality has been refactored out, we can delete some
unused local variables.
2016-01-05 17:08:14 -08:00
Durham Goode
b0d3f36941 verify: move changelog verification to its own function
This makes verify more modular so extensions can hook into it.
2016-01-05 17:08:14 -08:00
Durham Goode
797ef3c770 verify: move manifest verification to its own function
This makes verify more modular, making it easier for extensions to extend.
2016-01-05 18:34:39 -08:00
Durham Goode
828f841525 verify: move file cross checking to its own function
This is part of making verify more modular so extensions can hook into it.
2016-01-05 18:31:51 -08:00
Durham Goode
04c50e28c4 verify: move filelog verification to its own function
This makes verify more modular so extensions can hook in more easily.
2016-01-05 18:28:46 -08:00
Durham Goode
6ce82430dc verify: move checkentry() to be a class function
This is part of making verify more modular so extensions can hook into it.
2016-01-05 17:08:14 -08:00
Durham Goode
14ee73b340 verify: move checklog() onto class
This is part of an effort to make verify more modular so extensions can hook
into it.
2016-01-05 17:08:14 -08:00
Matt Mackall
13d86f9294 verify: clean up weird error/warning lists
Nested functions in Python are not able to assign to variables in the
outer scope without something like the list trick because assignments
refer to the inner scope. So, we formerly used a list to give an
object to assign into.

Now that error and warning are object members, the [0] hack is no
longer needed.
2015-12-20 16:33:44 -06:00
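The list trick mentioned in the commit above, and the object-member form that replaces it, can be sketched as below. The names are illustrative only and are not the actual verify code.

```python
def count_with_list_trick(items):
    errors = [0]  # one-element list that the nested function can mutate

    def err():
        # Mutating the list reaches the outer object; a plain
        # "errors += 1" here would refer to a new inner-scope name.
        errors[0] += 1

    for _ in items:
        err()
    return errors[0]


class Verifier:
    def __init__(self):
        self.errors = 0

    def err(self):
        self.errors += 1  # plain attribute, no [0] indirection needed


def count_with_class(items):
    verifier = Verifier()
    for _ in items:
        verifier.err()
    return verifier.errors


trick_total = count_with_list_trick("abc")
class_total = count_with_class("abc")
```

Python 3 later added the `nonlocal` statement for this situation, but it was not available in the Python 2 versions the code targeted at the time.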
Yuya Nishihara
ca75b0a3eb verify: remove unreachable code to reraise KeyboardInterrupt
KeyboardInterrupt should never be caught as it doesn't inherit Exception in
Python 2.5 or later. And if it was, "interrupted" would be printed twice.

https://docs.python.org/2.7/library/exceptions.html#exception-hierarchy
2015-12-20 18:38:21 +09:00
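The claim in the commit above is easy to check: a handler for `Exception` does not see `KeyboardInterrupt`. A minimal sketch, with hypothetical helper names:

```python
def run(action):
    try:
        action()
    except Exception:
        # KeyboardInterrupt derives from BaseException, so it skips this.
        return "handled"
    return "ok"


def interrupt():
    raise KeyboardInterrupt


try:
    outcome = run(interrupt)
except KeyboardInterrupt:
    outcome = "propagated"
```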
Durham Goode
1f8eab7615 verify: move exc() function onto class
This is part of an effort to make verify more modular so extensions can hook
into it.
2015-12-18 16:42:39 -08:00
Durham Goode
43f6f64559 verify: move err() to be a class function
This is part of an effort to make it easier for extensions to hook into verify.
2015-12-18 16:42:39 -08:00
Durham Goode
fc349ab32f verify: move warn() to a class level function
This is part of the effort to make verify more modular so extensions can hook
into it more easily.
2015-12-18 16:42:39 -08:00
Durham Goode
37c5c70102 verify: move fncachewarned up to a class variable
This is part of making verify more modular so hooks can extend it.
2015-12-18 16:42:39 -08:00
Durham Goode
81530280b2 verify: move widely used variables into class members
This will allow us to start moving some of the nested functions inside verify()
out onto the class.

This will allow extensions to hook into verify more easily.
2015-12-18 16:42:39 -08:00
Durham Goode
21a5133eda verify: move verify logic into a class
In order to allow extensions to hook into the verification logic more easily, we
need to refactor it into multiple functions. The first step is to move it to a
class so the shared state can be more easily accessed.
2015-12-18 16:42:39 -08:00
Augie Fackler
578db94dcb verify: add a hook that can let extensions manipulate file lists
Without a hook of this nature, narrowhg[0] clones always result in 'hg
verify' reporting terrible damage to the entire repository
history. With this hook, we can ignore files that aren't supposed to
be in the clone, and then get an accurate report of any damage present
(or not) in the repo.

0: https://bitbucket.org/Google/narrowhg
2015-11-04 12:14:18 -05:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'; however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Gregory Szorc
16bb1ed421 verify: use absolute_import 2015-08-08 18:48:10 -07:00
Matt Mackall
89c8dd4cd3 censor: mark experimental option 2015-06-25 17:56:26 -05:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
2015-06-23 22:20:08 -07:00
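The two exception syntaxes discussed in the commit above look like this. The `parse_rev` helper is a made-up example; the old form is shown only in a comment because Python 3 rejects it.

```python
def parse_rev(text):
    try:
        return int(text)
    # Old form, accepted by Python 2 only:  except ValueError, inst:
    except ValueError as inst:
        return "invalid revision: %s" % inst


good = parse_rev("42")
bad = parse_rev("tip?")
```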
Gregory Szorc
8e1cdd21f4 verify: print hint to run debugrebuildfncache
Corrupt fncache is now a recoverable operation. Inform the user how to
recover from this warning.
2015-06-20 20:11:53 -07:00
Matt Mackall
fa83df39d1 verify: clarify misleading fncache message
This is a message about cache corruption, not repository corruption or
actually missing files. Fix message and reduce to a warning.
2015-06-19 12:00:06 -05:00
Matt Mackall
9fdd0e8abe verify: add a note about a paleo-bug
In the very early days of hg, it was possible to commit /dev/null because our
patch importer was too simple. Repos from this era may still
exist, so add a note about why we ignore this name.
2015-03-27 15:13:21 -05:00
Mike Edgar
49d296f5b7 verify: report censored nodes if configured policy is abort 2014-10-14 16:16:04 -04:00
Yuya Nishihara
6852c0bb11 verify: do not prevent verifying a repository containing hidden changesets
Since afe2bc876c89, repo.cancopy() cannot be used to check if the repo is
a bundlerepository.

repo.url() should always have "scheme:", so it isn't necessary to parse
by util.url().
2014-02-19 22:19:45 +09:00
Pierre-Yves David
a39281497e clfilter: verify logic should be unfiltered
Verifying a changelog obviously needs all of it. The verify logic now
ensures it works on an unfiltered repository.
2012-10-08 17:08:52 +02:00
Bryan O'Sullivan
61197562ff verify: fix all doubled-slash sites (issue3665) 2012-10-24 09:27:47 -07:00
Bryan O'Sullivan
dde0f8331e verify: tolerate repeated slashes in a converted repo (issue3665)
These slashes are a hangover from issue3612, fixed in d5787cfaa7cf.

Although the bugfix in that commit is correct, the test it adds
does not replicate the conditions for the bug correctly.
2012-10-22 18:05:40 -07:00
FUJIWARA Katsunori
ec48173940 verify: rename "hasmanifest" variable for source code readability
Before this patch, there are two ambiguous variables: "havemf" and
"hasmanifest".

"havemf" means whether there are any "manifest" entries.

"hasmanifest" means whether there are any "changelog" entries
referring to "manifest" entry.

This patch renames "hasmanifest" to "refersmf" to make the
difference from "havemf" clear.
2012-10-04 01:24:05 +09:00
FUJIWARA Katsunori
beefacf118 verify: use appropriate local variable in "checkentry()"
Before this patch, "checkentry()" internal function uses both
"node"(argument of itself) and "n"(defined in outer of it) variables.

Because all callers of "checkentry()" use "n" to refer to the object
which is passed to "checkentry()" as "node", both refer to the same
object in "checkentry()". So, "checkentry()" works correctly.

But such usage is not good for independence of "checkentry()".

This patch replaces "n" in "checkentry()" with "node".
2012-10-04 01:24:05 +09:00
FUJIWARA Katsunori
07219f1b5d verify: use appropriate node information to show verification error
Before this patch, verify module shows verification error message
below:

    unknown parent 2 <HASH_OF_P2> of <HASH_OF_P1>

even though it should show:

    unknown parent 2 <HASH_OF_P2> of <HASH_OF_TARGET>

This patch uses appropriate node information.
2012-10-04 01:24:05 +09:00
Patrick Mezard
1e03a5cb1d verify: do not choke on valid changelog without manifest
Before this change:

  $ hg init
  $ hg branch foo
  $ hg ci -m branchfoo
  $ hg verify
  checking changesets
  checking manifests
   0: empty or missing manifest
  crosschecking files in changesets and manifests
  checking files
  0 files, 1 changesets, 0 total revisions
  1 integrity errors encountered!
  (first damaged changeset appears to be 0)
  [1]
2012-08-21 20:51:16 +02:00
Brodie Rao
a706d64a2c cleanup: replace naked excepts with except Exception: ... 2012-05-12 16:02:46 +02:00