Commit Graph

6 Commits

Author SHA1 Message Date
FUJIWARA Katsunori
da42d55c85 largefiles: update largefiles even if rebase is aborted by conflict
Before this patch, largefiles in the working directory aren't updated
correctly, if rebase is aborted by conflict. This prevents users from
viewing appropriate largefiles while resolving conflicts.

While rebase, largefiles in the working directory are updated only at
successful committing in the special code path of
"lfilesrepo.commit()".

To update largefiles even if rebase is aborted by conflict, this patch
centralizes the logic of updating largefiles in the working directory
into the "mergeupdate" wrapping "merge.update".


This is a temporary way to fix with less changes. For fundamental
resolution of this kind of problems in the future, largefiles in the
working directory should be updated with other (normal) files
simultaneously while "merge.update" execution: maybe by hooking
"applyupdates".

"Action list based updating" introduced by hooking "applyupdates" will
also improve performance of updating, because it automatically
decreases target files to be checked.


Just after this patch, there are some improper things in "Case 0" code
path of "lfilesrepo.commit()":

  - "updatelfiles" invocation is redundant for rebase
  - detailed comment doesn't meet to rebase behavior

These will be resolved after the subsequent patch for transplant,
because this code path is shared with transplant.


Even though replacing "merge.update" in rebase extension by "hg.merge"
can also avoid this problem, this patch chooses centralizing the logic
into "mergeupdate", because:

  - "merge.update" invocation in rebase extension can't be directly
    replaced by "hg.merge", because:

    - rebase requires some extra arguments, which "hg.merge" doesn't
      take (e.g. "ancestor")

    - rebase doesn't require statistics information forcibly displayed
      in "hg.merge"

  - introducing "mergeupdate" can resolve also problem of some other
    code paths directly using "merge.update"

    largefiles in the working directory aren't updated regardless of
    the result of commands below, before this patch:

    - backout (for revisions other than the parent revision of the
      working directory without "--merge")

    - graft

    - histedit (for revisions other than the parent of the working
      directory


When "partial" is specified, "merge.update" doesn't update dirstate
entries for standins, even though standins themselves are updated.

In this case, "normallookup" should be used to mark largefiles as
"possibly dirty" forcibly, because applying "normal" on lfdirstate
treats them as "clean" unexpectedly.

This is reason why "normallookup=partial" is specified for
"lfcommands.updatelfiles".


This patch doesn't test "hg rebase --continue", because it doesn't
work correctly if largefiles in the working directory are modified
manually while resolving conflicts. This will be fixed in the next
step of refactoring for largefiles.

All changes of tests/*.t files other than test-largefiles-update.t in
this patch come from invoking "updatelfiles" not after but before
statistics output of "hg.update", "hg.clean" and "hg.merge".
2014-08-24 23:47:26 +09:00
FUJIWARA Katsunori
ea6fcbf330 largefiles: confirm existence of outgoing largefile entities in remote store
Before this patch, "hg summary" and "hg outgoing" show and count up
all largefiles changed/added in outgoing revisions, even though some
of them are already uploaded into remote store.

This patch confirms existence of outgoing largefile entities in remote
store, to show and count up only really outgoing largefile entities at
"hg summary" and "hg outgoing".
2014-07-07 18:45:46 +09:00
FUJIWARA Katsunori
86a5211ac9 largefiles: show also how many data entities are outgoing at "hg outgoing"
Before this patch, "hg outgoing --large" shows which largefiles are
changed or added in outgoing revisions only in the point of the view
of filenames.

For example, according to the list of outgoing largefiles shown in "hg
outgoing" output, users should expect that the former below costs much
more to upload outgoing largefiles than the latter.

  - outgoing revisions add a hundred largefiles, but all of them refer
    the same data entity

    in this case, only one data entity is outgoing, even though "hg
    summary" says that a hundred largefiles are outgoing.

  - a hundred outgoing revisions change only one largefile with
    distinct data

    in this case, a hundred data entities are outgoing, even though
    "hg summary" says that only one largefile is outgoing.

But the latter costs much more than the former, in fact.

This patch shows also how many data entities are outgoing at "hg
outgoing" by counting number of unique hash values for outgoing
largefiles.

When "--debug" is specified, this patch also shows what entities (in
hash) are outgoing for each largefiles listed up, for debug purpose.

In "ui.debugflag" route, "addfunc()" can append given "lfhash" to the
list "toupload[fn]" always without duplication check, because
de-duplication is already done in "_getoutgoings()".
2014-07-07 18:45:46 +09:00
FUJIWARA Katsunori
3127f4abb3 largefiles: show also how many data entities are outgoing at "hg summary"
Before this patch, "hg summary --large" shows how many largefiles are
changed or added in outgoing revisions only in the point of the view
of filenames.

For example, according to the number of outgoing largefiles shown in
"hg summary" output, users should expect that the former below costs
much more to upload outgoing largefiles than the latter.

  - outgoing revisions add a hundred largefiles, but all of them refer
    the same data entity

    in this case, only one data entity is outgoing, even though "hg
    summary" says that a hundred largefiles are outgoing.

  - a hundred outgoing revisions change only one largefile with
    distinct data

    in this case, a hundred data entities are outgoing, even though
    "hg summary" says that only one largefile is outgoing.

But the latter costs much more than the former, in fact.

This patch shows also how many data entities are outgoing at "hg
summary" by counting number of unique hash values for outgoing
largefiles.

This patch introduces "_getoutgoings" to centralize the logic
(de-duplication, too) into it for convenience of subsequent patches,
even though it is not required in "hg summary" case.
2014-07-07 18:45:46 +09:00
FUJIWARA Katsunori
5ae1757d3d largefiles: add tests for summary/outgoing improved in subsequent patches
This patch adds tests for summary/outgoing improved in subsequent
patches, to reduce amount of diffs in each patches.

This patch adds new revisions below:

  - revision #2 adds new largefiles, but they contain as same data as
    one already existing

    this causes that multiple standins refer the same data entity

  - revision #3, #4 and #5 change the already existing largefile

    this causes that multiple data entities are outgoing for the standin.
    #5 can be used to check de-duplication of "(hash, filename)" pair.
2014-07-07 18:45:46 +09:00
Pierre-Yves David
c84e682681 test: split test-largefile.t in multiple file
The `test-largefiles.t` unified test is significantly longer (about 30%) than
any other tests in the mercurial test suite. As a result, its is alway the last
test my test runner is waiting for at the end of a run.

In practice, this means that `test-largefile.t` is wasting half a minute of my
life every times I'm running the mercurial test suites. This probably mean more
a few cumulated day by now.

I've finally decided to split it up in multiple smaller tests to bring it back in
reasonable length.

This changeset extracts independent test cases in two files. One dedicated to
wire protocole testing, and another one dedicated to all other tests that could
be independently extracted.

No test case were haltered in the making of this changeset.

Various timing available below. All timing have been done on a with 90 jobs on a
64 cores machine. Similar result are shown on firefly (20 jobs on 12 core).

General timing of the whole run
--------------------------------

We see a 25% real time improvement for no significant cpu time impact.

Before split:

    real   2m1.149s
    user  58m4.662s
    sys   11m28.563s

After split:

    real   1m31.977s
    user  57m45.993s
    sys   11m33.634s


Last test to finish (using run-test.py --time)
----------------------------------------------

test-largefile.t is now finishing at the same time than other slow tests.

Before split:

    Time      Test
    119.280   test-largefiles.t
     93.995   test-mq.t
     89.897   test-subrepo.t
     86.920   test-glog.t
     85.508   test-rename-merge2.t
     83.594   test-revset.t
     79.824   test-keyword.t
     78.077   test-mq-header-date.t

After split:

    Time      Test
    90.414   test-mq.t
    88.594   test-largefiles.t
    85.363   test-subrepo.t
    81.059   test-glog.t
    78.927   test-rename-merge2.t
    78.021   test-revset.t
    77.777   test-command-template.t



Timing of largefile test themself
-----------------------------------

Running only tests prefixed with "test-largefiles".
No significant change in cumulated time.

Before:

    Time     Test
    58.673   test-largefiles.t
     2.931   test-largefiles-cache.t
     0.583   test-largefiles-small-disk.t

After:

    Time     Test
    31.754   test-largefiles.t
    17.460   test-largefiles-misc.t
     8.888   test-largefiles-wireproto.t
     2.864   test-largefiles-cache.t
     0.580   test-largefiles-small-disk.t
2014-05-16 13:18:57 -07:00