Commit Graph

49 Commits

Author SHA1 Message Date
Mike Edgar
3136b352b8 filelog: use censored revlog flag bit to quickly check if a node is censored 2015-01-12 15:29:36 -05:00
Mike Edgar
ab4953ab9b filelog: censored files compare against empty data, have 0 size
To support "status" operations against working directories that are
the children of censored revisions, filelog must define "cmp" and "size"
for censored content.
2014-09-14 20:32:34 -04:00
Mike Edgar
04a8ee7081 filelog: raise CensoredNodeError when hash checks fail with censor metadata
With this change, when a revlog revision hash does not match its content, and
the content is empty with a special metadata key, the integrity failure is
assumed to be intentionally caused to remove sensitive content from repository
history.

To allow different Mercurial functionality to handle this scenario differently,
a more specific exception is raised than for "ordinary" hash failures.

Alternatives to this approach include, but are not limited to:

- Calling a hook when hashes mismatch to allow arbitrary tombstone validation.
  Cons: Irresponsibly easy to disable integrity checking altogether.
- Returning empty revision data eagerly instead of raising, masking the error.
  Cons: Push/pull won't roundtrip the tombstone, so client repos are unusable.
- Doing nothing differently at this layer. Callers must do their own detection
  of tombstoned data if they want to handle some hash checks and not others.
  - Impacts dozens of callsites, many of which don't have the revision data
  - Would probably be missing one or two callsites at any given time
  - Currently we throw a RevlogError, as do 12 other places in revlog.py.
    Callers would need to parse the exception message and/or ensure
    RevlogError is not thrown from any other part of their call tree.
2014-09-03 22:14:20 -04:00
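The approach described in 04a8ee7081 can be sketched roughly as follows. This is a minimal illustration, not Mercurial's code: `check_hash`, `is_tombstone`, and the stand-in `RevlogError` class are hypothetical names, and the `censored` metadata key is an assumption, since the message only mentions "a special metadata key".

```python
import hashlib

NULLID = b"\0" * 20
META_MARKER = b"\x01\n"


class RevlogError(Exception):
    """Stand-in for the generic revlog integrity error."""


class CensoredNodeError(RevlogError):
    """Hash mismatch that is explained by a censor tombstone."""


def node_hash(text, p1=NULLID, p2=NULLID):
    # SHA-1 over the two parent ids (sorted) followed by the revision text.
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


def is_tombstone(text):
    # Assumption: a tombstone is a metadata block that contains a
    # "censored" key and is followed by no file data at all.
    if not text.startswith(META_MARKER):
        return False
    meta, sep, data = text[2:].partition(META_MARKER)
    if not sep:
        return False
    keys = [line.split(b": ", 1)[0] for line in meta.splitlines()]
    return data == b"" and b"censored" in keys


def check_hash(text, node, p1=NULLID, p2=NULLID):
    if node_hash(text, p1, p2) == node:
        return text
    if is_tombstone(text):
        # More specific than RevlogError, so callers can choose to handle it.
        raise CensoredNodeError("censored node")
    raise RevlogError("integrity check failed")
```

Because `CensoredNodeError` subclasses the generic error, callers that catch `RevlogError` keep working, while callers that care about censorship can catch the narrower type first.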
Mike Edgar
f67e6d6c04 filelog: parsemeta stops returning unused key list
Currently, only the returned meta dictionary is used. An upcoming change will
use the returned text offset.
2014-09-02 14:42:30 -04:00
Mike Edgar
8a41e32180 filelog: make parsemeta a public module function, to be used by censor module 2014-09-10 00:18:15 -04:00
Mike Edgar
bfb97f3a41 filelog: make packmeta a public module function, to be used by censor 2014-09-10 00:17:17 -04:00
Durham Goode
a1ef848104 filelog: use super() for calling base functions
filelog had some hardcoded revlog.revlog.foo() calls. This changes it to
use super() instead so that extensions can replace the filelog base class.
2013-05-01 10:39:37 -07:00
Sune Foldager
3a06c3752e filelog: add file function to open other filelogs 2011-05-10 17:38:58 +02:00
Sune Foldager
f18c48ae86 filelog: extract metadata parsing and packing
_parsemeta returns the dictionary and a list of keys in the order they appear
in metadata. This can be used to repack the dictionary in the same order.

_packmeta creates metadata from a dictionary and an optional key-order list.

In _parsemeta, we use slices and re.search instead of str.index so we can accept
both buffers and strings.
2011-04-30 16:32:50 +02:00
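The parsing and packing described in f18c48ae86 might look something like the sketch below. It assumes the metadata block is delimited by "\1\n" markers and holds "key: value" lines; `parse_meta` and `pack_meta` are illustrative names, not the real `_parsemeta` and `_packmeta` signatures.

```python
import re

META_MARKER = b"\x01\n"


def parse_meta(text):
    """Return (meta dict, keys in stored order, offset where data starts)."""
    if text[:2] != META_MARKER:
        return {}, [], 0
    # Slices plus re.search, rather than a str-only method such as index.
    match = re.search(re.escape(META_MARKER), text[2:])
    if match is None:
        return {}, [], 0  # no closing marker: treat as plain data
    end = match.start() + 2
    meta, keys = {}, []
    for line in bytes(text[2:end]).splitlines():
        key, value = line.split(b": ", 1)
        meta[key] = value
        keys.append(key)
    return meta, keys, end + 2


def pack_meta(meta, text, keys=None):
    """Prefix text with a metadata block, honouring an optional key order."""
    if keys is None:
        keys = sorted(meta)
    block = b"".join(k + b": " + meta[k] + b"\n" for k in keys)
    return META_MARKER + block + META_MARKER + text
```

Passing the key list returned by `parse_meta` back into `pack_meta` reproduces the original bytes, which is the repacking use case the message mentions.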
Matt Mackall
06f318150a filelog: move metadata parsing to a helper function 2011-01-06 17:04:47 -06:00
Nicolas Dumazet
70fb2c5315 filelog: cmp: don't read data if hashes are identical (issue2273)
filelog.renamed() is an expensive call as it reads the filelog if p1 == nullid.
It's more efficient to first compute the hash, and to bail early if
the computed hash is the same as the stored nodeid.

'samehashes' variable is not strictly necessary, but helps for comprehension.
2010-07-05 19:49:54 +09:00
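The early exit described in 70fb2c5315 can be illustrated with a toy model. `TinyFilelog` is a hypothetical in-memory stand-in, not Mercurial's filelog; in this toy the rename metadata is kept alongside the data, whereas in a real revlog finding it requires reading the revision, which is the cost the commit avoids.

```python
import hashlib

NULLID = b"\0" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    # Node id: SHA-1 over the two parent ids (sorted) followed by the text.
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


class TinyFilelog:
    """Hypothetical in-memory stand-in; not Mercurial's filelog class."""

    def __init__(self):
        self._revs = {}  # node -> (p1, p2, metadata block, file data)
        self.reads = 0   # number of expensive full reads performed

    def add(self, data, meta_block=b"", p1=NULLID, p2=NULLID):
        # The hash covers the metadata block too, so a renamed file hashes
        # differently from the same data stored without rename metadata.
        node = node_hash(meta_block + data, p1, p2)
        self._revs[node] = (p1, p2, meta_block, data)
        return node

    def read(self, node):
        self.reads += 1
        return self._revs[node][3]

    def cmp(self, node, text):
        """Return True if text differs from the data stored at node."""
        p1, p2, meta_block, _ = self._revs[node]
        samehashes = node_hash(text, p1, p2) == node
        if samehashes:
            return False  # fast path: nothing was read
        if meta_block:
            # Rename metadata changes the hash even when the data is
            # unchanged, so only this case needs the slow comparison.
            return self.read(node) != text
        return True
```

For an unrenamed file, both the "same" and "different" answers come back without a single read; only revisions carrying rename metadata fall through to the slow path.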
Nicolas Dumazet
e3057a06d5 filelog: test behaviour for data starting with "\1\n"
Because "\1\n" is a separator for metadata, data starting with "\1\n" is
handled specifically. It was not tested.

size() calls return incorrect data if the original data had been "\1\n"-escaped.
There's no obvious way to fix it for now, so just flag the error in the code
and add an "expected failure" kind of test.
2010-07-05 18:43:46 +09:00
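The size problem noted in e3057a06d5 can be reproduced in miniature. The sketch assumes that data starting with "\1\n" is escaped by prepending an empty metadata block, and that a size taken from the stored bytes is what gets reported; `store` and `read` are hypothetical helpers, not filelog methods.

```python
META_MARKER = b"\x01\n"


def store(text):
    # Data that itself starts with the marker gets an empty metadata block
    # prepended, so it cannot be mistaken for real metadata on read.
    if text.startswith(META_MARKER):
        return META_MARKER + META_MARKER + text
    return text


def read(stored):
    # Strip a leading metadata block, if any, and return the file data.
    if not stored.startswith(META_MARKER):
        return stored
    end = stored.index(META_MARKER, 2)
    return stored[end + 2:]


original = b"\x01\nnot metadata"
stored = store(original)
roundtrip = read(stored)            # the content survives the round trip
overhead = len(stored) - len(original)  # but the stored length is larger
```

The content round-trips correctly, yet any size derived from the stored length overshoots by the length of the prepended empty block.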
Nicolas Dumazet
6e75efdbcb cmp: document the fact that we return True if content is different
This is similar to the __builtin__.cmp behaviour, but still not
straightforward, as the everyday meaning of a comparison usually is
"find out if they are different".
2010-07-09 11:02:39 +09:00
Benoit Boissinot
aa7653dcd6 merge with stable 2010-03-16 01:16:19 +01:00
Benoit Boissinot
67b2142e44 filelog: no need to optimize an uncommon case, assume meta = {} 2010-03-16 01:16:04 +01:00
Benoit Boissinot
3c5670f245 filelog: text is stored modified when it starts with '\1\n' 2010-03-16 01:12:46 +01:00
Ronny Pfannschmidt
d170b686d2 filelog: sort meta entries, ensure deterministic order 2010-02-16 21:04:04 +01:00
Matt Mackall
8d99be19f0 many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
Matt Mackall
595d66f424 Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
Benoit Boissinot
5b1d4a56ec filelog encoding: move the encoding/decoding into store
the escaping of directories ending with .i or .d doesn't
really belong to filelog.

we put the encoding/decoding in store instead; for backwards
compat, streamclone and the fncache file format still use the
partially encoded filenames.
2009-05-20 18:35:47 +02:00
Martin Geisler
750183bdad updated license to be explicit about GPL version 2 2009-04-26 01:08:54 +02:00
Matt Mackall
b28ccc9a94 revlog: kill from-style imports
They're slow.
2009-01-11 22:55:36 -06:00
Dirkjan Ochtman
574603a8c0 use dict.iteritems() rather than dict.items()
This should be faster and more future-proof. Calls where the result is to be
sorted using util.sort() have been left unchanged. Calls to .items() on
configparser objects have been left as-is, too.
2009-01-12 09:16:03 +01:00
Joel Rosdahl
4f8012378a Remove unused imports 2008-03-06 22:23:41 +01:00
Joel Rosdahl
5dae3059a0 Expand import * to allow Pyflakes to find problems 2008-03-06 22:23:26 +01:00
Christian Ebert
5c18a69d2e Prefer i in d over d.has_key(i) 2008-01-20 14:39:25 +01:00
Thomas Arendsen Hein
4d29c6dc8e Updated copyright notices and add "and others" to "hg version" 2007-06-19 08:51:34 +02:00
Matt Mackall
04561e556e revlog: simplify revlog version handling
- pass the default version as an attribute on the opener
- eliminate config option mess
2007-03-22 19:52:38 -05:00
Matt Mackall
b4f6965b1d revlog: don't pass datafile as an argument 2007-03-22 19:12:03 -05:00
Matt Mackall
f17a4e1934 Replace demandload with new demandimport 2006-12-13 13:27:09 -06:00
Benoit Boissinot
0fba88fde3 use forward "/" for internal path and static http, fix issue437 2006-12-05 16:33:40 +01:00
Brendan Cully
ff2b0fa1de filelog.annotate is now obsolete 2006-09-30 20:56:26 -07:00
Matt Mackall
50e42b8388 filelog: make metadata method private 2006-09-17 22:38:06 -05:00
Brendan Cully
627744e332 Teach annotate to follow copies. 2006-08-18 14:59:18 -07:00
Matt Mackall
11dee4259d merge: use file size stored in revlog index
Add size method to filelog to handle nodes with renames
2006-08-15 22:46:35 -05:00
Matt Mackall
a04e906b9c filelog.cmp: return 0 for equality
spotted by Alexis Carvalho
2006-08-15 16:28:00 -05:00
Matt Mackall
e3e04b8f17 Move cmp bits from filelog to revlog 2006-08-15 14:18:13 -05:00
Matt Mackall
b5a0f2743c filelog: add hash-based comparisons
For status, rather than reconstruct full file versions from revlog for
comparison, compare hashes.
2006-08-14 15:07:00 -05:00
Vadim Gelfer
dc377b58c1 update copyrights. 2006-08-12 12:30:02 -07:00
Benoit Boissinot
7dd019b60b use __contains__, index or split instead of str.find
str.find returns -1 when the substring is not found; -1 evaluates
to True and is a valid index, which can lead to bugs.
Using alternatives when possible makes the code clearer and less
prone to bugs. (and __contains__ is faster in microbenchmarks)
2006-07-09 01:30:30 +02:00
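The pitfalls named in 7dd019b60b are easy to demonstrate. The function names below are made up for illustration; only `str.find`, `str.index`, and the `in` operator are standard Python.

```python
def has_colon_buggy(s):
    # Bug: find() returns -1 (truthy) when ":" is missing, and
    # 0 (falsy) when ":" is the first character.
    return bool(s.find(":"))


def has_colon(s):
    # __contains__ states the intent directly.
    return ":" in s


def key_buggy(s):
    # Bug: when ":" is missing, -1 is used as a slice bound and the
    # last character is silently dropped.
    return s[:s.find(":")]


def key(s):
    # index() raises ValueError instead of returning a usable -1.
    return s[:s.index(":")]
```

With `find`, a missing separator produces a wrong answer that looks valid; with `in` or `index`, it produces either a correct boolean or an explicit error.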
Vadim Gelfer
9a0c813fdc use demandload more. 2006-06-20 23:58:21 -07:00
mason@suse.com
58d4ef2538 Use revlogng and inlined data files by default
This changes revlog to specify revlogng by default.  Inlined
data files are also used unless a flags option is found in the .hgrc.
Some example hgrc files:

[revlog]
# use the original revlog format
format=0

[revlog]
# use revlogng.  Because no flags are included, inlined data files
# will also be selected
format=1

[revlog]
# use revlogng but do not inline the data files with the index
flags=

[revlog]
# the new default
format=1
flags=inline
2006-05-08 14:26:18 -05:00
mason@suse.com
ed26ff0cae Implement revlogng.
revlogng results in smaller indexes, can address larger data files, and
supports flags and version numbers.

By default the original revlog format is used.  To use the new format,
use the following .hgrc field:

[revlog]
# format choices are 0 (classic revlog format) and 1 (revlogng)
format=1
2006-04-04 16:38:43 -04:00
Matt Mackall
91766807e2 Re-enable the renamed check fastpath 2005-12-22 13:18:44 -06:00
twaldmann@thinkmo.de
55d74a6b77 fixed some stuff pychecker shows, marked unclear/wrong stuff with XXX 2005-11-14 03:59:35 +02:00
twaldmann@thinkmo.de
584aa09dd6 minor optimization: save some string trash 2005-11-14 02:30:19 +02:00
mpm@selenic.com
7f0689647a fix some rename/copy bugs
- delete copy information when we update dirstate

  hg was keeping the copy state and marking things as copied on
  multiple commits

- files that are renamed should have no parents

  if you do a rename/copy to an existing file, it should not be marked
  as descending from its previous revisions.

- remove spurious print from filelog.renamed

- add some more copy tests
2005-08-27 22:04:17 -07:00
mpm@selenic.com
2b4a95a639 Add some rename debugging support 2005-08-27 20:58:53 -07:00
mpm@selenic.com
e175fdde9b Break apart hg.py
- move the various parts of hg.py into their own files
- create node.py to store node manipulation functions
2005-08-27 14:21:25 -07:00