Previously, revlog.size equals to revlog.rawsize. However, the flag
processor framework could make a difference - "size" could mean the length
of len(revision(raw=False)), while "rawsize" means len(revision(raw=True)).
This patch makes it so.
This corrects "hg status" output when flag processor is involved. The call
stack looks like:
basectx.status -> workingctx._buildstatus -> workingctx._dirstatestatus
-> workingctx._checklookup -> filectx.cmp -> filelog.cmp -> filelog.size
-> revlog.size
This is similar to "revlog: use raw revisions in revdiff". revdiff()
generates raw text used in revlog directly.
This makes test-flagprocessor.t happy.
Similar to fixes in revlog.py, this patch uses "rawtext" to explicitly label
contents expected to be raw, and makes sure content stored in _cache is raw
text.
Now test-flagprocessor.t points us to another issue.
"baserevision" returns the text that will be used to apply deltas. Since
deltas are against raw texts, "baserevision" should return raw text.
Now test-flagprocessor.t points us to a new error.
This shows flag processor is broken with a bundle repo.
The test creates non-liner history to exercise code path where the
deltaparent cannot be reused.
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.
This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
flag processors registered on them. Due to the need to handle non-commutative
operations, flag transforms are applied in stable order but the order in which
the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method allowing to register processors on flags.
Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
relies on extensions to wrap around 'addrevision()' to set flags on revision
data, and on the flagprocessor to perform the actual transformation of its
contents. In the lfs case, this means we need to process flags before we meet
the 2GB size check, leading to performing some operations before it happens:
- if flags are set on the revision data, we assume some extensions might be
modifying the contents using the flag processor next, and we compute the
node for the original revision data (still allowing extension to override
the node by wrapping around 'addrevision()').
- we then invoke the flag processor to apply registered transforms (in lfs's
case, drastically reducing the size of large blobs).
- finally, we proceed with the 2GB size check.
Note: In the case a cachedelta is passed to 'addrevision()' and we detect the
flag processor modified the revision data, we chose to trust the flag processor
and drop the cachedelta.