118 Commits

Author SHA1 Message Date
Mads Kiilerich
afcb680fd7 largefiles: refactor usercachepath - extract user cache path function
It is convenient to have the user cache location explicitly.
2016-03-19 08:23:55 -07:00
liscju
802a1dd151 largefiles: replace invocation of os.path module by vfs in lfutil.py
Replace invocations of os.path functions with vfs methods. Unfortunately
(in my view) this makes the code less readable, because clearly named
path variables have to be replaced with vfs(..) calls.
I need guidance on how to make such a transition more readable.

For example, this patch has a few places that call
wvfs.join(standindir). Before this patch standindir was an absolute
path; it is now relative, because it is also used in the expression
wvfs.join(standindir, pat).
2016-03-14 20:20:22 +01:00
Anton Shestakov
f67064214c largefiles: use revisions as a ui.progress unit
Using plural form is consistent with other progress units, and "1 out of 5
revisions" sounds more correct. Also, tests don't show this, but if you have
'speed' item in progress.format config, it shows e.g. '100 revisions/sec',
which also seems better.
2016-03-11 22:26:06 +08:00
Matt Harbison
6d46368119 largefiles: prevent committing a missing largefile
Previously, if the largefile was deleted at the time of a commit, the standin
was silently not updated and its current state (possibly garbage) was recorded.
The test makes it look like this is somewhat of an edge case, but the same thing
happens when an `hg revert` followed by `rm` changes the standin.

Aside from the second invocation of this in lfutil.updatestandinsbymatch()
(which is what triggers this test case), the three other uses are guarded by
dirstate checks for added or modified, or an existence check in the filesystem.
So aborting in lfutil.updatestandins() should be safe, and will avoid silent
skips in the future if this is used elsewhere.
2016-01-24 00:10:19 -05:00
Matt Harbison
9906cb44b6 largefiles: fix an explicit largefile commit after a remove (issue4969)
The change in 6fce9a02f069 to handle a normal -> largefile switch was too
aggressive in preserving the original matcher names.  If a largefile is
explicitly provided by the user, but only the standin exists in dirstate, then
only the standin can be committed.

There's still maybe an issue when the largefile is deleted outside of Mercurial:

  $ rm large
  $ hg ci -m "oops" large
  large: The system cannot find the file specified
  nothing changed
  [1]
2016-01-23 20:51:17 -05:00
Mads Kiilerich
76651c0e10 largefiles: fix commit of missing largefiles
92117e4f6f8d improved merging of standin files referencing missing largefiles.
It did however not test or fix commits of such merges; it would abort.

To fix that, change copytostore to skip and warn about missing largefiles
with a message similar to the one for a failing get from remote filestores. (It
would perhaps in both cases be better to emit a more helpful warning like
"warning: standin file for large1 references 58e24f733a which can't be found in
the local store".)

To test this, make sure commit doesn't find the "missing" largefile in the global
usercache. For further testing, verify that update and status work as expected
after this.

This will also effectively backout 159c82dd6523.
2016-01-17 17:23:32 +01:00
Mads Kiilerich
24ee58b9f9 largefiles: check hash of files in the store before copying to working dir
If the store somehow got corrupted, users could end up in weird situations that
were very hard to recover from or lead to propagation of the corruption.

Instead, spend the extra time checking the hash when copying to the working
directory. If it doesn't match, emit a warning, and don't put wrong content in
the working directory.
2015-10-23 21:27:29 +02:00
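The check described in 24ee58b9f9 can be sketched as follows. This is a minimal illustration, not the extension's code: the helper names `sha1_of_file` and `copy_if_hash_matches`, the warning text, and the use of plain file paths are all assumptions made for the example.

```python
import hashlib
import os
import shutil


def sha1_of_file(path, chunk_size=128 * 1024):
    """Return the hex SHA-1 of the file at path, reading it in chunks."""
    hasher = hashlib.sha1()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def copy_if_hash_matches(store_path, dest_path, expected_hash, warn=print):
    """Copy store_path to dest_path only if its content matches expected_hash.

    Returns True after a successful copy. On a mismatch it emits a warning
    (hypothetical wording) and leaves dest_path untouched.
    """
    actual = sha1_of_file(store_path)
    if actual != expected_hash:
        warn("%s: store content hashes to %s, expected %s"
             % (os.path.basename(dest_path), actual, expected_hash))
        return False
    shutil.copyfile(store_path, dest_path)
    return True
```

Hashing costs one extra read of the file, which is the "extra time" the commit message accepts in exchange for never writing corrupted content into the working directory.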
Mads Kiilerich
602d83e7e6 largefiles: fix explicit commit of normal/largefile switch
Commit of corresponding normal/largefiles pairs would only commit the standin.
That is usually fine, except if either the normal file or the standin is a
remove while the other is an add. In that case it would either give duplicate
colliding entries or lose the file.

Instead, commit both filenames if one of them is a remove.
2015-10-21 00:18:11 +02:00
FUJIWARA Katsunori
54ce7de850 dirstate: show develwarn for write() invocation without transaction
This is used to detect 'dirstate.write()' invocation without the value
gotten by 'repo.currenttransaction()' (mainly focused on 3rd party
extensions).
2015-10-17 01:15:34 +09:00
Mads Kiilerich
2de7a8b7cf largefiles: better handling of merge of largefiles that not are available
Before, when merging revisions with missing largefiles, the missing largefiles
would be fetched as a part of the merge. If that failed (for example because
the main repository temporarily was unavailable), the largefile would be left
missing. However, the next commit would abort and (seemed to) fail when
markcommitted tried to mark the standin file as normal and thus had to hash the
largefile that didn't exist. (Actually, the commit would succeed but the
largefile update that follows right after the commit transaction would abort -
quite confusing.)

To fix that, make sure that synclfdirstate only marks files as normal if they
actually exist.
2015-10-12 19:22:34 +02:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Matt Harbison
dd27c92fee largefiles: ensure lfutil.getstandinmatcher() only matches standins
Previously, simply having the largefiles extension loaded without any largefiles
added would crash when amending with -I.  The problem was that with no files in
the matcher, the pattern list of files joined with 'standindir' was empty, and
scmutil.match() would match everything.  In lfutil.composestandinmatcher(), the
match function is used to test if the file is a standin, and after getting a
false positive, the code proceeds to call lfutil.splitstandin().  This returns
None because it isn't a standin, which blows up when passed to rmatcher.matchfn().

Manually overriding _always in getstandinmatcher() probably isn't necessary
anymore, but we leave well enough alone on stable.  This regressed in
78632d61a993.
2015-08-12 12:26:39 -04:00
Matt Harbison
66999fb4d6 largefiles: use the optional badfn argument when building a matcher
The monkey patching in cat() can't be fixed, because it still delegates to the
original bad().  Overriding commands.cat() should go away in favor of overriding
cmdutil.cat() anyway, and that matcher can be wrapped with matchmod.badmatch().
2015-06-05 22:53:15 -04:00
Martin von Zweigbergk
8714aec6c0 largefiles: avoid match.files() in conditions
See 559ee9ecae07 (match: introduce boolean prefix() method,
2014-10-28) for reasons to avoid match.files() in conditions.
2015-05-19 13:08:21 -07:00
Martin von Zweigbergk
94f4135a12 largefiles: pass in whole matcher to getstandinmatcher()
The choice between the "always" case and the other case is done in
getstandinmatcher() and the next patch will change how it's determined
based on the matcher, so let's prepare by passing in the matcher, not
just the matcher's files.
2015-05-26 11:06:43 -07:00
Martin von Zweigbergk
c576c59596 largefiles: drop unused 'pats' parameter from getstandinmatcher()
The parameter wasn't used even when it was imported from elsewhere in
7e9e4773f809 (hgext: add largefiles extension, 2011-09-24).
2015-05-26 09:46:48 -07:00
Augie Fackler
a5b17bd9d1 cleanup: use __builtins__.any instead of util.any
any() is available in all Python versions we support now.
2015-05-16 14:30:07 -04:00
Matt Harbison
673b9701b1 largefiles: use the share source as the primary local store (issue4471)
The benefit of retargeting the local store to the share source is that all
shares will always have access to the largefiles any one of them commits, even if
the user cache is deleted (which is documented to be OK to do).  Further, any
push into the source (and now any shares) will likewise make the largefile(s)
visible to all related repositories.

In order to maintain compatibility with existing repos, where the largefiles
would be cached only in the local share, fall back to searching the local share
if the largefile isn't found at the share source.

The unshare command should probably be taught to copy the source store into the
store for the repo being unshared to complete the loop.


This patch changes the test like this:

  @@ -159,6 +159,5 @@
     $ hg share -q src share_dst --config extensions.share=
     $ hg -R share_dst update -r0
     getting changed largefiles
  -  large: largefile $HASH not available from file:///$TESTTMP\share_dst
  -  0 largefiles updated, 0 removed
  +  1 largefiles updated, 0 removed
     1 files updated, 0 files merged, 0 files removed, 0 files unresolved


The issue writeup mentions pushing a largefile from a remote repo to the main
local repo, and the largefile is then not available in any shares.  Since the
push doesn't cache the largefile in $USERCACHE, the trashed $USERCACHE in this
test is equivalent.
2015-04-04 19:06:43 -04:00
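The lookup order described in 673b9701b1 can be sketched as below. This is a simplified illustration under stated assumptions, not the extension's code: `find_store_path` and its parameters are hypothetical, and each store is modeled as a plain directory of files named by hash.

```python
import os


def find_store_path(source_store, local_store, lfhash):
    """Return (path, exists) for the largefile identified by lfhash.

    Hypothetical helper: the share source's store is checked first, and
    the local share's store is used as a fallback for older repositories.
    """
    primary = os.path.join(source_store, lfhash)
    if os.path.exists(primary):
        return primary, True
    fallback = os.path.join(local_store, lfhash)
    if os.path.exists(fallback):
        return fallback, True
    # Not found anywhere: report the primary location, which is where a
    # new largefile would be written.
    return primary, False
```

Returning the primary path even when nothing exists lets callers that only need a write target use the same function as callers that read.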
Matt Harbison
cbad853a33 largefiles: introduce lfutil.findstorepath()
The handful of direct uses of lfutil.storepath() merely need a single path to
read from or write to the largefile, whether or not it exists.  Most callers
that care about the file existing call lfutil.findfile(), in order to fall back
from the store to the user cache.

localstore._verify() doesn't call lfutil.findfile().  This prevents redirecting
the store to the share source because the largefiles for existing repos may not
be in the source's store, so verification may fail.  It can't be changed to call
findfile(), because findfile() links the file from the usercache to the local
store[1], and because it returns None instead of a path if the file doesn't
exist.

For now, this method is just a cover for lfutil.storepath(), but it will be
filled out in an upcoming patch.


[1] Maybe we shouldn't care?  But on a filesystem that doesn't support
    hardlinks, then verify will take a lot longer, and start to consume disk
    space.
2015-04-04 19:31:40 -04:00
Matt Harbison
2752441e0a largefiles: drop os.path reference in lfutil.storepath()
localrepo.join() can concatenate multiple parts on its own.
2015-04-04 15:43:00 -04:00
Matt Harbison
82c07b771f largefiles: replace 'ctx._repo' with 'ctx.repo()'
2015-03-12 23:08:16 -04:00
FUJIWARA Katsunori
d70128b84b largefiles: access to specific fields only if largefiles enabled (issue4547)
Even if the largefiles extension is enabled in a repository, a "repo"
object that has not been set up by "largefiles.reposetup()" is
unexpectedly passed to overridden functions in the cases below, because
extensions are enabled strictly per repository.

  (1) clone without -U:
  (2) pull with -U:
  (3) pull with --rebase:

    the combination of "enabled@src", "disabled@dst" and
    "not-required@src" causes this situation.

       largefiles     requirement
    @src     @dst     @src            result
    -------- -------- --------------- --------------------
    enabled  disabled not-required    aborted unexpectedly
                      required        requirement error (intentional)
    -------- -------- --------------- --------------------
    enabled  enabled  *               success
    -------- -------- --------------- --------------------
    disabled enabled  *               success (only for "pull")
    -------- -------- --------------- --------------------
    disabled disabled not-required    success
                      required        requirement error (intentional)
    -------- -------- --------------- --------------------

  (4) update/revert with a subrepo disabling largefiles

In these cases, overridden functions access largefiles-specific
fields of a "repo" object that has not been set up by
"largefiles.reposetup()", and execution is aborted.

  - (1), (2), (4) access "_lfstatuswriters" in
    "getstatuswriter()" invoked via "updatelfiles()"

  - (3) accesses "_lfcommithooks" in "overriderebase()"

To access these fields safely, this patch checks whether the passed
"repo" object has been set up by "largefiles.reposetup()" before
accessing them.

This patch checks for the existence of the newly introduced
"_largefilesenabled" instead of "_lfcommithooks" and
"_lfstatuswriters" directly, because the former is a better name for a
generic "largefiles is enabled in this repo" marker than the latter.

In the future, all other overridden functions should skip
largefiles-specific processing for efficiency, and "_largefilesenabled"
is also better suited for that purpose.

BTW, "lfstatus" can't be used for this purpose, because some code
paths set it forcibly regardless of whether it exists in the specified
"repo" object.
2015-02-26 06:03:39 +09:00
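The guard described in d70128b84b can be sketched roughly as follows. The two repo classes and both helper functions are hypothetical stand-ins written for this example; only the attribute names "_largefilesenabled" and "_lfstatuswriters" come from the commit message.

```python
class PlainRepo:
    """Stand-in for a repo object that never went through reposetup()."""


class LargefilesRepo:
    """Stand-in for a repo object that was set up by the extension."""

    def __init__(self, writer):
        self._largefilesenabled = True
        self._lfstatuswriters = [writer]


def is_largefiles_enabled(repo):
    """True only for repo objects carrying the setup marker."""
    return getattr(repo, "_largefilesenabled", False)


def get_status_writer(repo, default):
    """Return the largefiles status writer, or default for plain repos.

    Checking the marker first avoids touching "_lfstatuswriters" on a
    repo object that does not have it.
    """
    if not is_largefiles_enabled(repo):
        return default
    return repo._lfstatuswriters[-1]
```

A single generic marker keeps the check independent of which largefiles-specific field a given override happens to need.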
Mads Kiilerich
f30992016d largefiles: show progress when checking standin hashes in outgoing changesets
This check can take a huge amount of time, and we should give the user a hint
that something is going on.
2015-01-16 19:51:25 +01:00
Matt Harbison
11a4cb363a largefiles: ensure that the standin files are available in getlfilestoupload()
The function only adds the hash content of the file to the set to upload if the
file in the ctx is a standin.  It is called by overrides.summaryremotehook(),
which is called in the summary method.  The largefiles extension switches
'lfstatus' on in summary, so the standins shouldn't be visible when obtaining a
context there.

The reason this wasn't noticed before is that the 'lfstatus' attribute is only
being set on the unfiltered repo because of how repoview delegates attribute
assignment.  Therefore any filtered view will return a context containing
standins, whether or not 'lfstatus' was set in the various overrides methods.
That will be fixed in the next patch.  But without this change, the next patch
would have test failures for 'summary --large' stating there are no files to
upload.
2014-12-17 21:51:09 -05:00
Mads Kiilerich
b420dd92b1 spelling: fixes from proofreading of spell checker issues
2014-04-17 22:47:38 +02:00
FUJIWARA Katsunori
3f4cab3fa5 largefiles: move "copyalltostore" invocation into "markcommitted"
Before this patch, during "hg convert", largefiles avoids copying
largefiles in the working directory into the store area by a combination
of setting "repo._isconverting" in "mercurialsink{before|after}" and
checking it in "copytostoreabsolute".

This avoidance is needed during "hg convert", because converting doesn't
update largefiles in the working directory.

But this implementation is not efficient, because:

  - invocation in "markcommitted" can easily ensure updating
    largefiles in the working directory

    "markcommitted" is invoked only when a new revision is committed via
    "commit" of "localrepository" (= with files in the working
    directory). On the other hand, "commitctx" may be invoked directly
    for in-memory committing.

  - committing without updating the working directory (e.g. "import
    --bypass") also needs this kind of avoidance

To make this kind of avoidance efficient, this patch does the following:

  - move "copyalltostore" invocation into "markcommitted"
  - remove meaningless procedures below:
    - hooking "mercurialsink{before|after}" to (un)set "repo._isconverting"
    - checking "repo._isconverting" in "copytostoreabsolute"

This patch also invokes "copyalltostore" in "_commitcontext", because
"_commitcontext" expects that largefiles in the working directory are
copied into the store area after "commitctx". In this case, the working
directory is used as a kind of temporary area to write largefiles out,
even though converted revisions are committed via "commitctx" (without
updating normal files).
2014-11-08 00:48:41 +09:00
FUJIWARA Katsunori
ba0a7a0792 largefiles: update standins only at the 1st commit of "transplant --continue"
Before this patch, "hg transplant --continue" may record incorrect
standins, because the largefiles extension always avoids updating
standins while transplanting, even though largefiles in the working
directory may have been modified manually at the 1st commit of "hg
transplant --continue".

On the other hand, updating standins should be avoided at
subsequent commits for efficiency reasons.

To update standins only at the 1st commit of "hg transplant
--continue", this patch uses "automatedcommithook", which updates
standins by "lfutil.updatestandinsbymatch()" only at the 1st commit of
resuming.

Even after this patch, "repo._istransplanting = True" is still needed
to avoid some status reports while updating largefiles in
"lfcommands.updatelfiles()".

This is the reason why this patch removes not "repo._istransplanting = True"
in "overriderebase" but the check of "getattr(repo,
"_istransplanting", False)" in "updatestandinsbymatch".
2014-11-08 00:48:41 +09:00
FUJIWARA Katsunori
5e13e41d95 largefiles: avoid redundant "updatelfiles" invocation in "overridetransplant"
With "hg transplant --merge REV", largefiles newly coming from the 2nd
parent (= REV) are marked as "a"(dded) by "patch.patch()", and have to
be marked as "n"(ormal) after commit.

But until changeset 978713c45992, such largefiles were unexpectedly
still marked as "a" even after commit, because no additional entry is
added to the filelog of such largefiles and they aren't listed in
"repo[newnode].files()" in this case: "newnode" is one of the newly
committed changesets (= result of "repo.commit()").

The "updatelfiles" invocation in "overridetransplant" masks this problem
by forcibly synchronizing lfdirstate to dirstate.

Now, the "updatelfiles" invocation in "overridetransplant" is redundant,
because changeset 978713c45992 made "markcommitted" use "ctx.files()"
to get targets of "synclfdirstate" instead of "repo[newnode].files()".
2014-11-08 00:48:38 +09:00
FUJIWARA Katsunori
c5d20c2595 largefiles: introduce "_lfstatuswriters" to customize status reporting
"lfutil.getstatuswriter" is the utility that gets the appropriate
function for writing largefiles-specific status out from
"repo._lfstatuswriters".

This patch uses a "stack" with an initial element instead of a flag like
"_isXXXXing", because:

  - the former works correctly even when customizations are nested, and
  - ensuring at least one element makes an emptiness check unnecessary
2014-11-05 23:24:47 +09:00
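The stack idea from c5d20c2595 can be sketched as below. This is an illustration only: the `Repo` class and the `pushed_writer` helper are hypothetical, and `getstatuswriter` is reduced to a one-argument function that may not match the real signature.

```python
from contextlib import contextmanager


class Repo:
    """Hypothetical stand-in for a repository object."""

    def __init__(self, default_writer):
        # The stack always holds the default writer at the bottom, so
        # readers never have to check whether it is empty.
        self._lfstatuswriters = [default_writer]


def getstatuswriter(repo):
    """Return the innermost (most recently pushed) status writer."""
    return repo._lfstatuswriters[-1]


@contextmanager
def pushed_writer(repo, writer):
    """Temporarily customize status reporting; restores on exit."""
    repo._lfstatuswriters.append(writer)
    try:
        yield
    finally:
        repo._lfstatuswriters.pop()
```

With a boolean flag, an inner customization that resets the flag on exit would also cancel the outer one. With a stack, leaving the inner block simply re-exposes the outer writer.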
FUJIWARA Katsunori
381dc2e6b0 largefiles: update standins only at the 1st commit of "hg rebase --continue"
Before this patch, "hg rebase --continue" may record incorrect
standins, because the largefiles extension always avoids updating
standins while rebasing, even though largefiles in the working
directory may have been modified manually at the 1st commit of "hg
rebase --continue".

On the other hand, updating standins should be avoided at
subsequent commits for efficiency reasons.

To update standins only at the 1st commit of "hg rebase --continue",
this patch introduces the stateful callable object
"automatedcommithook", which updates standins by
"lfutil.updatestandinsbymatch()" only at the 1st commit of resuming.

Even after this patch, "repo._isrebasing = True" is still needed to
avoid some status reports while updating largefiles in
"lfcommands.updatelfiles()".

This is the reason why this patch removes not "repo._isrebasing = True"
in "overriderebase" but the check of "getattr(repo, "_isrebasing",
False)" in "updatestandinsbymatch".
2014-11-05 23:24:47 +09:00
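A stateful callable of the kind 381dc2e6b0 describes might look like the sketch below. The class name, constructor arguments, and the injected `update_standins` function are assumptions for illustration; the extension's "automatedcommithook" may differ in detail.

```python
class AutomatedCommitHook:
    """Update standins only at the first commit after resuming.

    `update_standins` is a hypothetical callable taking (repo, match)
    and returning the matcher to commit with.
    """

    def __init__(self, resuming, update_standins):
        self.resuming = resuming
        self.update_standins = update_standins

    def __call__(self, repo, match):
        if self.resuming:
            # Largefiles may have been edited by hand before
            # "--continue", so refresh the standins exactly once.
            self.resuming = False
            return self.update_standins(repo, match)
        # Subsequent automated commits skip the (expensive) update.
        return match
```

Keeping the "first commit" state inside the object means callers invoke the hook the same way for every commit and need no extra flag on the repository.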
FUJIWARA Katsunori
2a694232e7 largefiles: factor out procedures to update standins for pre-committing
This patch factors out procedures to update standins for
pre-committing. This is one of the preparations for skipping such
procedures depending on the invocation context.

For example, resuming automated committing (e.g. "hg rebase
--continue") should update standins at the 1st commit, because
largefiles in the working directory may have been modified manually.
On the other hand, it should avoid updating standins at subsequent
commits for efficiency reasons.

For simplicity, this patch moves the procedures mechanically, with
only the replacements below.

  - "self"            => "repo"
  - "lfutil."         => (none)
  - "orig" invocation => returning "match"

Using "fstandin" instead of "standin" as the name of the local variable
in the loop below is the only special care taken, because the latter
shadows the function of the same name in "lfutil.py".

  [before]
                for standin in standins:
                    lfile = lfutil.splitstandin(standin)
                    if lfdirstate[lfile] != 'r':
                        lfutil.updatestandin(self, standin)

  [after]
    for fstandin in standins:
        lfile = splitstandin(fstandin)
        if lfdirstate[lfile] != 'r':
            updatestandin(repo, fstandin)
2014-11-05 23:24:47 +09:00
FUJIWARA Katsunori
7b82204ba3 largefiles: factor out procedures to update lfdirstate for post-committing
Before this patch, the procedures to update lfdirstate for post-committing
are scattered in "lfilesrepo.commit". In the case of "hg commit" with
patterns for target files ("Case 2"), lfdirstate is updated BEFORE
the actual commit.

This patch factors out the procedures to update lfdirstate for
post-committing into "lfutil.markcommitted", and makes it callable via
"markcommitted" of the context passed to "lfilesrepo.commitctx".

"markcommitted" of the context is called only when the context is
committed successfully.

Passing the original "markcommitted" of the context is meaningless in
this patch, but is required in a subsequent one to prepare something
before invoking it.
2014-11-05 23:24:47 +09:00
Mads Kiilerich
3b22bfee79 largefiles: remove confusing rev parameter for lfdirstatestatus
Dirstate only works on the repo wctx.
2014-10-03 00:42:40 +02:00
Martin von Zweigbergk
6f453479df largefiles: access status fields by name rather than index
2014-10-03 22:10:08 -07:00
Martin von Zweigbergk
0daa605421 lfutil: avoid creating unnecessary copy of status tuple
In lfdirstatestatus(), the status tuple gets deconstructed, the lists
get updated, and then an identical status tuple gets created and
returned. Change it so we simply return the original tuple.
2014-10-03 21:21:20 -07:00
Martin von Zweigbergk
1a4e0a3d51 dirstate: separate 'lookup' status field from others
The status tuple returned from dirstate.status() has an additional
field compared to the other status tuples: lookup/unsure. This field
is just an optimization and not something most callers care about
(they want the resolved value of 'modified' or 'clean'). To prepare
for a single future status type, let's separate out the 'lookup' field
from the rest by having dirstate.status() return a pair: (lookup,
status).
2014-10-03 21:44:10 -07:00
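The shape described in 1a4e0a3d51, together with the by-name access from 6f453479df, can be illustrated with a toy model. Everything here is an assumption for the example: the `Status` tuple, the `dirstate_status` function, and the input mapping are hypothetical and do not reproduce Mercurial's implementation.

```python
from collections import namedtuple

# Hypothetical status type with named fields.
Status = namedtuple(
    "Status", "modified added removed deleted unknown ignored clean")


def dirstate_status(states):
    """Toy classifier returning the pair (lookup, status).

    `states` maps a filename to one of the Status field names, or to
    "lookup" for files whose state cannot be decided without reading
    their contents.
    """
    buckets = {name: [] for name in Status._fields}
    lookup = []
    for filename, state in sorted(states.items()):
        if state == "lookup":
            lookup.append(filename)
        else:
            buckets[state].append(filename)
    return lookup, Status(**buckets)
```

A caller can then write `lookup, st = dirstate_status(...)` and read `st.modified` or `st.clean` by name, while only the few callers that care about the optimization look at `lookup`.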
FUJIWARA Katsunori
fa4741e97b largefiles: factor out synchronization of lfdirstate for future use
2014-08-11 22:29:43 +09:00
Matt Harbison
cbf609dee6 largefiles: avoid unnecessary creation of .hg/largefiles when opening lfdirstate
Previously, the directory '.hg/largefiles' would always be created if it didn't
exist when the lfdirstate was opened.  If there were no standin files, no
dirstate file would be created in the directory.  The end result was that
enabling the largefiles extension globally, but not explicitly adding a
largefile would result in the repository eventually sprouting this directory.

Creation of this directory effectively changes readonly operations like summary
and status into operations that require write access.  Without write access,
commands that would succeed without the extension loaded would abort with a
surprising error when the extension is loaded, but not actively used:

  $ hg sum -R /tmp/thg --config extensions.largefiles=
  parent: 16541:00dc703d5aed
   repowidget: specify incoming bundle by plain file path to avoid url parsing
  branch: default
  abort: Permission denied: '/tmp/thg/.hg/largefiles'


This change is simpler than changing the callers of openlfdirstate() to use the
'create' parameter that was introduced in 74522122b97d, and probably how that
should have been implemented in the first place.
2014-07-17 20:17:17 -04:00
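The behavior change in cbf609dee6 amounts to creating the directory lazily. A minimal sketch, assuming a hypothetical helper `lfdirstate_dir` that receives the ".hg" path and the list of standins found:

```python
import os


def lfdirstate_dir(hg_dir, standins):
    """Return the largefiles state directory under hg_dir.

    The directory is created only when there is at least one standin,
    so read-only commands on repositories without largefiles never need
    write access. (Hypothetical helper, for illustration only.)
    """
    lfdir = os.path.join(hg_dir, "largefiles")
    if standins and not os.path.isdir(lfdir):
        os.makedirs(lfdir)
    return lfdir
```

With no standins the function has no side effects, which is what keeps commands like summary and status read-only when the extension is loaded but unused.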
Mads Kiilerich
428d9fe9e0 largefiles: fix profile of unused largefilesdirstate._ignore
2013-10-03 18:01:21 +02:00
FUJIWARA Katsunori
a3f2ae29d0 largefiles: centralize the logic to get outgoing largefiles
Before this patch, "overrides.getoutgoinglfiles()" (called by
"overrideoutgoing()" and "overridesummary()") and "lfilesrepo.push()"
implement similar logic to get outgoing largefiles separately.

This patch centralizes the logic to get outgoing largefiles in
"lfutil.getlfilestoupload()".

"lfutil.getlfilestoupload()" takes an "addfunc" argument, because each
caller needs different information (and it is useful for future
enhancements).

  - "overrides.getoutgoinglfiles()" needs only filenames
  - "lfilesrepo.push()" needs only hashes of largefiles
2014-04-16 00:37:24 +09:00
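The "addfunc" design in a3f2ae29d0 is a plain callback pattern, sketched below. The function `get_lfiles_to_upload` and the `outgoing` list of (filename, hash) pairs are hypothetical simplifications; the real function walks outgoing changesets in a repository.

```python
def get_lfiles_to_upload(outgoing, addfunc):
    """Call addfunc(filename, lfhash) for every outgoing largefile.

    `outgoing` is a hypothetical iterable of (filename, hash) pairs
    standing in for the standins found in outgoing changesets.
    """
    for filename, lfhash in outgoing:
        addfunc(filename, lfhash)


# Made-up sample data: one file appears in two revisions.
outgoing = [
    ("big.bin", "1f" * 20),
    ("video.mp4", "2e" * 20),
    ("big.bin", "3d" * 20),
]

# A caller that only needs filenames (as for outgoing/summary output).
filenames = set()
get_lfiles_to_upload(outgoing, lambda fn, lfhash: filenames.add(fn))

# A caller that only needs hashes (as for push).
hashes = set()
get_lfiles_to_upload(outgoing, lambda fn, lfhash: hashes.add(lfhash))
```

The traversal is written once, and each caller decides what to keep from each (filename, hash) pair.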
Mads Kiilerich
69da8bff75 largefiles: use repo.wwrite for writing standins (issue3909)
2013-04-27 00:41:42 +02:00
Mads Kiilerich
9c2c68a468 largefiles: don't hash all largefiles when initializing a lfdirstate
The largefiles will be hashed on demand if necessary ... and sometimes it isn't
necessary.
2013-04-15 23:31:56 +02:00
Mads Kiilerich
70924c1c48 largefiles: drop limitreader, use filechunkiter limit
filechunkiter.close was a noop.
2013-04-16 01:55:57 +02:00
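Reading at most a fixed number of bytes in chunks, which is the role 70924c1c48 gives to the chunk iterator's limit, can be sketched as a generator. The name `iter_chunks` and its defaults are assumptions for this example and are not Mercurial's API.

```python
import io


def iter_chunks(fp, size=65536, limit=None):
    """Yield chunks of up to `size` bytes from fp.

    If `limit` is given, stop after that many bytes in total, even if
    the file has more data.
    """
    while limit is None or limit > 0:
        nbytes = size if limit is None else min(limit, size)
        chunk = fp.read(nbytes)
        if not chunk:
            break  # end of file reached before the limit
        if limit is not None:
            limit -= len(chunk)
        yield chunk


# Read only the first 6 bytes of a 10-byte stream, 4 bytes at a time.
first_six = list(iter_chunks(io.BytesIO(b"abcdefghij"), size=4, limit=6))
```

Folding the limit into the iterator removes the need for a separate wrapper object around the file just to cap how much is read.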
Mads Kiilerich
76454d8f22 largefiles: remove blecch from lfutil.copyandhash - don't close the passed fd
2013-04-15 23:43:50 +02:00
Mads Kiilerich
d9e36e98f8 largefiles: drop lfutil.blockstream - use filechunkiter like everybody else
The old chunk size is kept - just to avoid changing it.
2013-04-15 23:43:44 +02:00
Mads Kiilerich
9fb2d6a4da largefiles: refactoring - return hex from _getfile and copyandhash
2013-04-15 23:35:18 +02:00
Mads Kiilerich
092b44d44d largefiles: refactoring - create destination dir in lfutil.link
2013-04-15 23:32:33 +02:00
Mads Kiilerich
fa7d021ee4 largefiles: drop --cache-largefiles again
This goes a step further than 974959d637b7 and backs out the unreleased
--cache-largefiles option. The same can be achieved with --lfrev heads(pulled()) and
we shouldn't introduce unnecessary command line options.
2013-04-15 01:59:11 +02:00
Mads Kiilerich
34d52d8c6e largefiles: getstandinmatcher should not depend on existence of directories
Looking for a (potentially empty) directory was not reliable - both because it
is a reasonable assumption that empty directories can be removed and because it
wasn't created in all cases ... such as when pulling to an existing repository.
2013-02-28 13:45:18 +01:00
Mads Kiilerich
c066baf66c largefiles: fix commit when using relative paths from subdirectory
Remove cwd handling from getstandinmatcher - it did not belong there, as proven
by the tests.
2013-01-25 16:59:34 +01:00