This path is also used for extdiff, which is how I crossed paths with it.
Without this, an AttributeError occurs looking for 'lfstatus' on
localrepository. See also ca0085e432d6.
The other archive method is for the archival.py override, so it doesn't need to
be special cased like this. (It looks like it is only called for the top level
repo.) Likewise, the transplant override is also for commands.py. The other
overrides set lfstatus before examining it.
BTW, C implementation of hexdigest() for SHA-1/256/512 returns hex
hash in lower case, and doctest in Python standard hashlib assumes
that, too. But it isn't explicitly described in API document or so.
Therefore, we can't assume that hexdigest() always returns hex hash in
lower case, for any hash algorithms, on any Python runtimes and
versions.
From point of view of that, it is reasonable for portability that
77f8c025a6ef applies lower() on hex hash in overridefilemerge().
But on the other hand, in largefiles extension, there are still many
code paths comparing between hex hashes or storing hex hash into
standin file, without lower().
Switching to hash algorithm other than SHA-1 may be good chance to
clarify our policy about hexdigest()-ed hash value string.
- assume that hexdigest() always returns hex hash in lower case, or
- apply lower() on hex hash in appropriate layers to ensure
lower-case-ness of it for portability
Before this patch, updatestandin() takes "standin" argument, and
applies splitstandin() on it to pick out a path to largefile (aka
"lfile" or so) from standin.
But in fact, all callers already knows "lfile". In addition to it,
many callers knows both "standin" and "lfile".
Therefore, making updatestandin() take only one of "standin" or
"lfile" is inefficient.
Before this patch, this code path contains two loops for m._files: one
for replacement with standin, and another for elimination of None,
which comes from previous replacement ("standin in wctx or
lfdirstate[f] == 'r'" case in tostandin()).
These two loops can be unified into simple one "for" loop.
Updating standin for newly added largefile is needed, only if same
name largefile exists in destination context at linear merging. In
such case, updated standin is used to detect divergence of largefile
at overridefilemerge().
Otherwise, standin doesn't have any responsibility for its content
(usually, it is empty).
There are some code paths, which apply standin() on same value
multilpe times instead of using already standin()-ed value.
"fstandin" is common name for "path to standin file" in lfutil.py, to
avoid shadowing "standin()".
readstandin() takes "node" argument to get changectx by "repo[node]".
There are some readstandin() invocations, which use ctx.node(),
ctx.rev(), or '.' as "node" argument above, even though corresponded
changectx object is already looked up on caller side.
This patch calls readstandin() with already known changectx itself, to
avoid meaningless re-construction of changectx (indirect case via
copytostore() is also included).
BTW, copytostore() uses "rev" argument only for readstandin()
invocation. Therefore, this patch also renames it to "revorctx" to
indicate that it can take not only revision ID or so but also
changectx, for readability.
There are many isstandin() invocations before splitstandin().
The former examines whether specified path starts with ".hglf/". The
latter returns after ".hglf/" of specified path if it starts with that
prefix, or returns None otherwise.
Therefore, value returned by splitstandin() can be used for
replacement of preceding isstandin(), and this replacement can omit
redundant string comparison after isstandin().
We will soon have matchers that don't have an _always field, so
largefiles needs to stop assuming that they do. _always is only used
by always(), so we safely replace that method instead.
The decoders were already run by default for the main repo, so this seemed like
an oversight.
The extdiff extension has been using 'archive' since a80ec1ea2694 to support -S,
and a colleague noticed that after diffing, making changes, and closing it, the
line endings were wrong for the diff-tool modified files in the subrepository.
(Files in the parent repo were correct, with the same .hgeol settings.) The
editor (Visual Studio in this case) reloads the file, but doesn't notice the EOL
change. It still adds new lines with the original EOL setting, and the file
ends up inconsistent.
Without this change, the first file `cat`d in the test prints '\r (esc)' EOL,
but the second doesn't on Windows or Linux.
Largefiles are fragile with the design where dirstate and lfdirstate must be
kept in sync.
To be less fragile, mark all clean largefiles as unsure ("normallookup") before
updating standins. After standins have been updated and we know exactly which
largefile standins actually was changed, mark the unchanged largefiles back to
clean ("normal").
This will make the failure mode more safe. If interrupted, the next command
will continue to perform extra hashing of all largefiles. That will do that all
largefiles that are out of sync with their standin will be marked dirty and
they will show up in status and can be cleaned with update --clean.
util.filechunkiter has been using a chunk size of 64k for more than 10 years,
also in years where Moore's law still was a law. It is probably ok to bump it
now and perhaps get a slight win in some cases.
Also, largefiles have been using 128k for a long time. Specifying that size
multiple times (or forgetting to do it) seems a bit stupid. Decreasing it to
64k also seems unfortunate.
Thus, we will set the default chunksize to 128k and use the default everywhere.
The default of pushing all largefiles referenced in outgoing revisions is safe,
but also expensive and sometimes not what is needed. We thus introduce a
--lfrev option, similar to what pull already has.
By specifying an empty set of revisions (or null), it is possible to get lazy
(and insecure!) pushes of revisions without referenced largefiles, similar to
how pull works.
This commit is part of bigger effort described in 'Windows UTF-8' plan.
It is not changing all invocations but the ones where change is
obviously correct and doesn't require complicated changes.
This patch consists of changes below (these can't be applied
separately).
- replace revset.extpredicate by registrar.revsetpredicate in
extensions
- remove setup() on an instance named as revsetpredicate in
uisetup()/extsetup() of each extensions
registrar.revsetpredicate doesn't have setup() API.
- put new entry for revsetpredicate into extraloaders in dispatch
This causes implicit loading predicate functions at loading
extension.
This loading mechanism requires that an extension has an instance
named as revsetpredicate, and this is reason why
largefiles/__init__.py is also changed in this patch.
Before this patch, test-revset.t tests that all decorated revset
predicates are loaded by explicit setup() at once ("all or nothing").
Now, test-revset.t tests that any revset predicate isn't loaded at
failure of loading extension, because loading itself is executed by
dispatch and it can't be controlled on extension side.
In an upcoming patch we'll have different behavior here for when 'merge
--force' is used as opposed to when other kinds of force operations are
performed, like rebases.