39 Commits

Author SHA1 Message Date
Ryan McElroy
cc75e9e988 remotefilelog: inform user of progress while finding missing objects
Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.

Reviewed By: quark-zju

Differential Revision: D6903134

fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
2018-04-13 21:51:21 -07:00
Durham Goode
9808c98f7e hg: use base parameter when doing background prefetch
Summary:
When background prefetch is enabled, let's use the base parameter to
limit how many files we download. This makes the operation O(files changed), and
on Windows results in a significant speed up.

Reviewed By: mjpieters

Differential Revision: D7108838

fbshipit-source-id: a46b8a7d897ee204b9a4c1f1c65d875dbd3e9bc7
2018-04-13 21:51:19 -07:00
Durham Goode
73568fd07d hg: stop sending flat manifests to treeonly clients
Summary:
Historically treeonly clients were ignoring flat manifests sent in
bundles, but with a recent change they will now try to recreate those manifests
when they receive such a bundle (so we don't lose any data when a user does 'hg
unbundle' on a flat-only bundle). This means we need to stop sending flat
manifests from the server to treeonly clients.

Differential Revision: D7083028

fbshipit-source-id: e4580b00a8be96fbef0ee624529c58f41cfa2752
2018-04-13 21:51:18 -07:00
Durham Goode
a367fc0b96 hg: wrap cg packers for both clients and servers
Summary:
Previously we were only wrapping the changegroup packers for the
clients. In a future diff we want to use the shallow changegroup packers to
govern when to send trees from the server, so we need them enabled for the
server.

It turns out they were mostly enabled for the server already. While we weren't
replacing the classes in changegroup._packermap (which is what they get
instantiated from), we were replacing them via the interposeclass decorator.
This meant that the server would construct a cg3packer, but it would have a
shallowcg1packer in the class hierarchy, so most of the code was running
already. So this should be a pretty low risk change. In theory.

Differential Revision: D7083042

fbshipit-source-id: 5ce44a9ceda4d7d4bd126f52a01e45e6e1e7de40
2018-04-13 21:51:18 -07:00
Durham Goode
0cb760c0ea hg: make mutablehistorypack implement the history store api
Summary:
This makes the mutable history pack implement the history store read
api so we can add it to the union store and read the contents of things that
have been written but not yet committed.

The mutablehistorypack fileentries variable has been changed to contain a dict
instead of a list so we can access it quickly during reads. The list is from a
legacy requirement where we used to maintain the order that the writer wrote in.
We no longer do that (instead we topologically sort what they've given us), so
switching from a list to a dict should be fine.
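
As a rough illustration of the list-to-dict change, the sketch below keeps
entries in a dict keyed by node for quick reads and topologically sorts them
afterwards. The names `toposort` and `entries`, and the parents-before-children
order, are assumptions made for this example and are not taken from
mutablehistorypack itself.

```python
def toposort(entries):
    """Order nodes so that parents come before children.

    `entries` maps node -> (p1, p2). Parents that are None or not present
    in `entries` are ignored. Assumes the parent graph has no cycles.
    """
    ordered = []
    seen = set()
    for start in sorted(entries):
        # Iterative depth-first walk; the flag marks "parents already pushed".
        stack = [(start, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                ordered.append(node)
                continue
            if node in seen or node not in entries:
                continue
            seen.add(node)
            stack.append((node, True))
            for parent in entries[node]:
                stack.append((parent, False))
    return ordered


# Entries arrive in arbitrary order; the dict gives O(1) lookup by node.
entries = {
    "c": ("b", None),
    "a": (None, None),
    "b": ("a", None),
}
parents_of_c = entries["c"]
order = toposort(entries)
```

Because the write order is recomputed by the sort, nothing depends on the
insertion order that a list would have preserved.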

Differential Revision: D7083036

fbshipit-source-id: ae511db60ab6432059714a2271c175dc9683b8e1
2018-04-13 21:51:18 -07:00
Durham Goode
9db9e614db hg: add add/removestore to union history store
Summary:
Adds addstore and removestore to the union history store, just like
we've already done for the union data store.

Reviewed By: ryanmce

Differential Revision: D7083055

fbshipit-source-id: 49f1a4156376d0cf5d6191c4d30ec923ddb2ec14
2018-04-13 21:51:17 -07:00
Durham Goode
75da4fb2e6 hg: add add/removeStore to cuniondatapackstore
Summary:
In a future diff we'll need the ability to modify the union store on
the fly, so let's add addstore and removestore apis.

Reviewed By: ryanmce

Differential Revision: D7051102

fbshipit-source-id: 901a50720bfdf4e5c59714d092830e65edccdfce
2018-04-13 21:51:17 -07:00
Saurabh Singh
e1194f0102 remotetreedatastore: make MissingNodesError subclass of KeyError
Summary:
The union content store

 - iterates through all the stores it has until the current store has the
   content.
 - Or, it fails eventually if none of the stores have the content.

It does so by relying on the current store throwing a KeyError if it doesn't
have the content.

`remotetreedatastore` was throwing the MissingNodesError which means any
remaining stores after it would not even get a chance to look for the content.
This commit fixes that by making MissingNodesError a subclass of KeyError.
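
The fall-through behavior described above can be sketched as follows. Only the
`MissingNodesError` name and its relationship to `KeyError` come from this
commit; `DictStore`, `RemoteLikeStore`, `UnionStore` and the `get` signature
are hypothetical stand-ins, not the real store classes.

```python
class MissingNodesError(KeyError):
    """Raised when a store cannot provide the requested node."""


class DictStore(object):
    """Hypothetical in-memory store; raises KeyError on a miss."""

    def __init__(self, data):
        self._data = data

    def get(self, name, node):
        return self._data[(name, node)]


class RemoteLikeStore(object):
    """Hypothetical store that always reports the node as missing."""

    def get(self, name, node):
        raise MissingNodesError((name, node))


class UnionStore(object):
    """Tries each store in turn until one has the content."""

    def __init__(self, *stores):
        self.stores = list(stores)

    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                # Also catches MissingNodesError, since it subclasses KeyError.
                continue
        raise KeyError((name, node))


union = UnionStore(RemoteLikeStore(), DictStore({("a.txt", "n1"): b"data"}))
result = union.get("a.txt", "n1")
```

If `MissingNodesError` did not derive from `KeyError`, the first store's
exception would propagate out of the loop and the second store would never be
consulted.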

Reviewed By: ryanmce

Differential Revision: D6867854

fbshipit-source-id: 784df195efcbe16f2e716968f3d93159afff6206
2018-04-13 21:51:16 -07:00
Mateusz Kwapich
847ef8adf9 remotefilelog: error gracefully when the file is too big
Summary:
We don't really support files larger than 2G, and with files larger than
4G remotefilelog crashes badly.

Reviewed By: markbt

Differential Revision: D7066855

fbshipit-source-id: 40cdebbe703a7b3f13ce84174bf6f96565e8c3b7
2018-04-13 21:51:15 -07:00
Durham Goode
4cbcaed9d9 hg: update remotefilelog cache checking to account for size
Summary:
If a remotefilelog loose object has a size of 0, it is invalid. Since
we're already doing a syscall to check if the file exists, we might as well try
to read its size in the process and consider it missing if the size is 0.

Previously we would report the file as present, then later, when we tried to
use the file, notice it was broken and download it on demand. If many files
were broken, this ad hoc one-by-one downloading would be extremely slow, so
catching it earlier lets us do a batch re-download and speeds things up.
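
A minimal sketch of the idea, assuming a loose-object cache made of plain files
on disk. The helper names `is_valid_cache_file` and `find_missing` are invented
for this example and are not the remotefilelog functions.

```python
import os
import shutil
import tempfile


def is_valid_cache_file(path):
    """Return True if path exists and is non-empty, using a single stat call."""
    try:
        return os.stat(path).st_size > 0
    except OSError:
        # A missing file is treated the same way as a zero-length one.
        return False


def find_missing(paths):
    """Collect every path that must be (re-)downloaded, for one batch request."""
    return [p for p in paths if not is_valid_cache_file(p)]


tmpdir = tempfile.mkdtemp()
try:
    good = os.path.join(tmpdir, "good")
    empty = os.path.join(tmpdir, "empty")
    absent = os.path.join(tmpdir, "absent")
    with open(good, "wb") as f:
        f.write(b"blob")
    open(empty, "wb").close()  # simulates a broken, zero-length object
    missing = find_missing([good, empty, absent])
finally:
    shutil.rmtree(tmpdir)
```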

Reviewed By: singhsrb

Differential Revision: D7046960

fbshipit-source-id: 0139f7c2cad3de617e9aae925a075bdb65f70ff5
2018-04-13 21:51:13 -07:00
Rohit Yadav
4838457b17 remotefilelog: before creating filenode, check in manifest if the file exists
Summary:
When hg addremove needs to remove a file, remotefilelog tries to create a filenode in order to prefetch the removed files from the server.  If the file is not in the parent context manifest, this throws an exception.

To solve this we have to first check if the file exists or not in the parent manifest.  If the file does not exist in the manifest then we don't need to prefetch it, and addremove will behave like forget rather than remove.
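
The check can be illustrated with a plain dict standing in for the parent
manifest. `split_removed` and the sample paths are hypothetical; the real code
works on manifest and context objects.

```python
def split_removed(removed, parent_manifest):
    """Partition removed paths by whether the parent manifest tracks them."""
    prefetch = [f for f in removed if f in parent_manifest]
    skip = [f for f in removed if f not in parent_manifest]
    return prefetch, skip


# Stand-in for the parent context manifest: path -> file node.
parent_manifest = {"a.txt": "node-a", "b.txt": "node-b"}
prefetch, skip = split_removed(["a.txt", "untracked.txt"], parent_manifest)
```

Only the paths in `prefetch` would need a filenode; the ones in `skip` are
never looked up, so no exception is raised for them.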

Reviewed By: markbt

Differential Revision: D7009649

fbshipit-source-id: 0570bc00db546a455b9c2e4628740e24ca819dd6
2018-04-13 21:51:12 -07:00
Durham Goode
60f4968015 hg: add remotestore to history union store
Summary:
Previously we would only do ondemand downloading of tree blobs when
accessing data contents. When accessing just history we would just fail if the
data wasn't available locally. This adds the remote store to the history union
store so we can get history remotely.

Reviewed By: singhsrb

Differential Revision: D7003434

fbshipit-source-id: 839f8e84be35779ccb146d13ce3e1d6d1e7f46bd
2018-04-13 21:51:12 -07:00
Kostia Balytskyi
c17a144d34 hg: add debuglfsdownloadsize command
Summary:
This can be convenient when the size of LFS downloads between r1 and r2 is needed.

NB: this does not pay attention to sparse profiles and acts as if the checkout was full. We probably need to know how to do both.

Reviewed By: quark-zju

Differential Revision: D6938259

fbshipit-source-id: 52ab88be83339472f2eccafc746a191ff26c16c7
2018-04-13 21:51:12 -07:00
Durham Goode
5df613f729 hg: fix prefetch in treeonly mode
Summary:
remotefilelog prefetch was broken in treeonly mode since it referred to
the manifest revlog to get the parents. Let's switch to the more modern
manifestctx way of accessing parent information.

Reviewed By: quark-zju

Differential Revision: D6995267

fbshipit-source-id: e0c11fca0f2156be3f936a6e437e7a4d3dffe75b
2018-04-13 21:51:12 -07:00
Durham Goode
7b90d5d33a hg: make mutabledatapack implement data API
Summary:
As part of producing only one pack file per transaction, we need to
change the mutabledatapack store to allow reads of written-but-not-finalized
data. This will allow us to add the mutabledatapack to the union store, and then
keep it alive for the duration of the transaction.

Reviewed By: quark-zju

Differential Revision: D6944348

fbshipit-source-id: 1e721bd8e07335a9c1f9c6b7595a765ec018c007
2018-04-13 21:51:11 -07:00
Durham Goode
47fcb59147 hg: move datapack readentry function out
Summary:
In an upcoming diff we'll make mutabledatapack readable, so we need the
entry parsing logic to be accessible. Let's move it out.

Reviewed By: singhsrb

Differential Revision: D6944345

fbshipit-source-id: c75162bb7c4fa47c6339b7ebbf96a4e386bd04b3
2018-04-13 21:51:10 -07:00
Jun Wu
f1c575a099 flake8: enable F821 check
Summary:
This check is useful and detects real errors (e.g. fbconduit).  Unfortunately
`arc lint` will run it with both py2 and py3, so a lot of py2 builtins will
still trigger warnings.

I didn't find a clean way to disable py3 check. So this diff tries to fix them.
For `xrange`, the change was done by a script:

```
import sys
import redbaron

headertypes = {'comment', 'endl', 'from_import', 'import', 'string',
               'assignment', 'atomtrailers'}

xrangefix = '''try:
    xrange(0)
except NameError:
    xrange = range

'''

def isxrange(x):
    try:
        return x[0].value == 'xrange'
    except Exception:
        return False

def main(argv):
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        content = open(path).read()
        try:
            red = redbaron.RedBaron(content)
        except Exception:
            print('  warning: failed to parse')
            continue
        hasxrange = red.find('atomtrailersnode', value=isxrange)
        hasxrangefix = 'xrange = range' in content
        if hasxrangefix or not hasxrange:
            print('  no need to change')
            continue

        # find a place to insert the compatibility statement
        changed = False
        for node in red:
            if node.type in headertypes:
                continue
            # node.insert_before is an easier API, but it has bugs changing
            # other "finally" and "except" positions. So do the insert
            # manually.
            # # node.insert_before(xrangefix)
            line = node.absolute_bounding_box.top_left.line - 1
            lines = content.splitlines(1)
            content = ''.join(lines[:line]) + xrangefix + ''.join(lines[line:])
            changed = True
            break

        if changed:
            # "content" is faster than "red.dumps()"
            open(path, 'w').write(content)
            print('  updated')

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

For other py2 builtins that do not have a py3 equivalent, some `# noqa`
were added as a workaround for now.

Reviewed By: DurhamG

Differential Revision: D6934535

fbshipit-source-id: 546b62830af144bc8b46788d2e0fd00496838939
2018-04-13 21:51:09 -07:00
Kostia Balytskyi
6f179126c7 hg: on Windows, use mmap with explicit ACCESS_READ argument passed
Summary:
On Windows, the intended access to the file handle must align with the
intended access to the memory mapping object, see [1].

When called without an argument, Python's `mmap.mmap` on Windows assumes
`ACCESS_WRITE` mode[3], therefore we get failures like [2] in Lego Windows.

[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366537(v=vs.85).aspx

[2] P59034213

[3] https://docs.python.org/2/library/mmap.html
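
For illustration, a read-only mapping with the access mode passed explicitly
might look like this. The temporary file and its contents are made up for the
example; only `mmap.ACCESS_READ` is taken from the commit.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"packdata")
    # The file is opened read-only, so the mapping must be read-only as well.
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            header = m[:4]
        finally:
            m.close()
finally:
    os.unlink(path)
```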

Reviewed By: quark-zju

Differential Revision: D6952583

fbshipit-source-id: 93159f8282e27d3e62d859f4c220e7c3bdfbe958
2018-04-13 21:51:09 -07:00
Jun Wu
2946a1c198 codemod: use single blank line
Summary: This makes test-check-code cleaner.

Reviewed By: ryanmce

Differential Revision: D6937934

fbshipit-source-id: 8f92bc32f75b9792ac67db77bb3a8756b37fa941
2018-04-13 21:51:08 -07:00
Jun Wu
49cbfb1878 filelog: allow trading file history correctness for performance
Summary:
With certain setups, the file history could be incorrect. That makes
`adjustlinknode` much slower since it has to scan the full history.
It is extremely slow if there is a commit with massive renames.

This is undesirable since `hg log FILE` would give a wrong result.
But it does make the repo usable. Automation that does not care
about `hg log FILE` correctness can probably enable this relatively
safely.

This patch changes both shallow and full repos.

Reviewed By: DurhamG

Differential Revision: D6912051

fbshipit-source-id: 23d6f6c8dd91d4f72b43bc560cf26686bd6c4b47
2018-04-13 21:51:06 -07:00
Jun Wu
efc6fe7319 remotefilelog: disallow delta on copied revisions
Summary:
The remotefilelog cgunpacker logic could enter an infinite loop
if a copyfrom file node uses its copyto as delta base.

This repros without LFS in test-lfs-bundle.t.

Reviewed By: DurhamG

Differential Revision: D6910079

fbshipit-source-id: 99fea316e77218cd4bc9ea6f5506779a3e4ab9a6
2018-04-13 21:51:06 -07:00
Jun Wu
19c474492b remotefilelog: respect lfs copy metadata
Summary: Otherwise the copy data will be lost when applying LFS bundles.

Reviewed By: DurhamG

Differential Revision: D6906207

fbshipit-source-id: bc94c6614f9d4b2a2b4c7f44f57de49bd54d6b49
2018-04-13 21:51:05 -07:00
Jun Wu
c223bb4eaf remotefilelog: resolve lfs rawtext to vanilla rawtext before applying delta
Summary: This is similar to the previous patch, but applies to remotefilelog.

Reviewed By: DurhamG

Differential Revision: D6906206

fbshipit-source-id: 2a9a56a57544b5e4d892f77438b2faaadece73ee
2018-04-13 21:51:05 -07:00
Jun Wu
a226ef4969 revlog: forbid revdiff revisions with non-zero flags
Summary:
Calling revdiff with non-zero flags is a sign of a hard-to-debug
error. Raise ProgrammingError in this case.

The change is straightforward. Apply it to both shallow and full
repos.

Reviewed By: DurhamG

Differential Revision: D6910080

fbshipit-source-id: cbcf1a444de90e104867cc9f1525629b7edda851
2018-04-13 21:51:05 -07:00
Jun Wu
1be09a10bc remotefilelog: do not delta lfs revisions
Summary: This is similar to the previous patch, but applies to remotefilelog.

Reviewed By: DurhamG

Differential Revision: D6906212

fbshipit-source-id: 30383632046f57b169dcb8a2ba1c0dd73113154a
2018-04-13 21:51:05 -07:00
Jun Wu
6529990478 debugfilerevision: add a new debug command
Summary:
This is similar to `debugdata`, but instead of taking a file revision (or
file node in remotefilelog's case), it takes a revset.

This is more useful practically, since the user would know commit hashes
easily but file nodes are hidden from the UI.

This is intended to make it easier to investigate LFS contents.

Reviewed By: DurhamG, ryanmce

Differential Revision: D6891770

fbshipit-source-id: 415da9b773c30830a48c09eda9f1854c416e3222
2018-04-13 21:51:05 -07:00
Jun Wu
f9ba760920 lz4: fix "import lz4.block" compatibility with demandimport
Summary:
`import lz4.block` is incompatible with demandimport. Let's use
`from lz4 import block` instead.

Reviewed By: DurhamG

Differential Revision: D6908162

fbshipit-source-id: 37119e21f7b289f89e41ad04fbb7f1ec81181259
2018-04-13 21:51:04 -07:00
Kostia Balytskyi
8bd5930877 hg: try to make remotefilelog and gc more shared-handle-friendly
Summary:
We have seen bad `hg gc` behavior on Lego-Windows, caused evidently by the fact
that we currently use plain Python `open` and `os.unlink` instead of
Mercurial's advanced `posixfile` and `unlink`.

Reviewed By: DurhamG

Differential Revision: D6900689

fbshipit-source-id: cd7ebbbb734a6163d062622d1d4606fad43c91ac
2018-04-13 21:51:03 -07:00
Phil Cohen
0584f5d23f hg: fastverify: unify and fold into core
Summary: `fastverifier` was sometimes being overridden by `shallowverifier` when remotefilelog was enabled. Since the latter is a subset of the former, let's just fold both into the core verifier code backed by a config, `verify.skipmanifests`, that we can default to true.

Reviewed By: DurhamG

Differential Revision: D6882222

fbshipit-source-id: 9f337ca031a070425ccdc9ee02f6765e68436da9
2018-04-13 21:51:03 -07:00
Jun Wu
cc33c003c2 lz4: import lz4.block
Summary:
`lz4.block` needs to be imported explicitly before being able to
use `lz4.block.compress`.

We didn't notice this because we're using an old version of
`python-lz4`.

Reviewed By: DurhamG

Differential Revision: D6879877

fbshipit-source-id: 37e8fdc00386bef3733753f925ad308f42e5a740
2018-04-13 21:51:01 -07:00
Durham Goode
01b34be972 hg: prefetch trees before producing changegroup
Summary:
When building a changegroup, sometimes we need to access the list of
files changed by each commit. To do so we need to inspect the manifest.
Previously this would end up downloading each tree one-by-one, producing a bunch
of pack files. With this patch we now do one bulk download at the very
beginning.
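
The difference between the two download patterns can be sketched with a fake
remote that counts round trips. `CountingRemote`, `changed_files_one_by_one`
and `changed_files_bulk` are hypothetical names for this example and do not
correspond to the treemanifest API.

```python
class CountingRemote(object):
    """Hypothetical remote store that counts how many requests it serves."""

    def __init__(self, trees):
        self._trees = trees
        self.requests = 0

    def fetch(self, nodes):
        self.requests += 1
        return {n: self._trees[n] for n in nodes}


def changed_files_one_by_one(remote, manifest_nodes):
    """Old pattern: one request (and one pack file) per tree."""
    return [sorted(remote.fetch([n])[n]) for n in manifest_nodes]


def changed_files_bulk(remote, manifest_nodes):
    """New pattern: a single bulk download before any tree is inspected."""
    cache = remote.fetch(manifest_nodes)
    return [sorted(cache[n]) for n in manifest_nodes]


trees = {"m1": {"a.txt"}, "m2": {"a.txt", "b.txt"}, "m3": {"c.txt"}}
slow = CountingRemote(trees)
fast = CountingRemote(trees)
slow_result = changed_files_one_by_one(slow, ["m1", "m2", "m3"])
fast_result = changed_files_bulk(fast, ["m1", "m2", "m3"])
```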

Reviewed By: quark-zju

Differential Revision: D6873076

fbshipit-source-id: b916c15efca0770129340f798d3e7b165da6aec9
2018-04-13 21:51:01 -07:00
Jun Wu
e2a5493b04 basepack: workaround Python's mmap fd limit
Summary:
This is a resend of https://phab.mercurial-scm.org/D1430, without breaking
Windows.

I encountered "too many opened files" problem due to treemanifest packs on my
laptop. This patch seems to be the easiest solution without side effects. Other
choices are deleting files (seems like a non-ideal workaround), forcing a
repack (could be slow), and rewriting using Rust (could take too long).

The root cause is that Python's `mmap` implementation has to keep a fd
internally to support the `mmapobj.resize` API. We only need read-only
operations on the mmap object, so the fd is unnecessary. Re-implement a
minimal mmap interface for this purpose.

Reviewed By: DurhamG

Differential Revision: D6835890

fbshipit-source-id: 74c429e957cb8677682604eb02fc38b5b8d13ef7
2018-04-13 21:51:00 -07:00
Jun Wu
d25885950e remotefilelog: fix unbundle with lfs revisions
Summary:
unbundle should use raw revisions and keep flags as-is,
instead of using high-level `filelog.add` API.

Reviewed By: ryanmce

Differential Revision: D6806031

fbshipit-source-id: 3e1819a91ee869ac8023eefb3f3aa7542f770539
2018-04-13 21:50:57 -07:00
Phil Cohen
45c4a072f9 hgext: use relative imports wherever possible
Summary:
Port of D6798134 to fbsource. It eliminates module-import failures as well as errors like this:

```
mercurial.error.ForeignImportError: hgext.extlib.treedirstate: /home/phillco/.local/lib/python2.7/site-packages/hgext/extlib/treedirstate.so lives outside /..../hg
```

....that block other tests, like test-help.t

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D6799259

fbshipit-source-id: b77d1b565dbf52165e0847002be498648658e064
2018-04-13 21:50:56 -07:00
Durham Goode
3cc56d6007 imports: fix imports to refer to hgext 2018-01-09 15:23:52 -08:00
Durham Goode
8103079702 imports: import from hgext instead of hgext3rd
The only reason these worked is because they were in the system python path.
Which means the in-repo code wasn't actually being tested.
2018-01-09 15:23:52 -08:00
Ryan McElroy
dc858619e0 help: improve extension help messages
Test Plan: run-tests.py

Reviewers: ikostia, #mercurial

Reviewed By: ikostia

Differential Revision: https://phabricator.intern.facebook.com/D6683060

Signature: 6683060:1515505954:91bdc8841c2168bf93e7448cb0fa4d136d7a6e2f
2018-01-09 05:52:58 -08:00
Kostia Balytskyi
0e4d95b67e fb-hgext: fix gendoc-related issues
Summary:
These fixes are related to documentation-related check-style tests.

Depends on D6675344

Test Plan: - more tests pass

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.intern.facebook.com/D6675351
2018-01-09 03:44:33 -08:00
Durham Goode
fe980ff373 remotefilelog: move to hgext/
Summary:
Moves the remotefilelog extension into hgext/ and its tests into
tests/.

I did not fix up all the check-module errors, since it's a ton of work for
very little impact at this point.

Test Plan: make local && ./run-tests.py

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D6680030
2018-01-08 18:58:08 -08:00