425 Commits

Author SHA1 Message Date
Tony Tung
c299d186f4 [cdatapack] implement _find()
Summary: Depends on D3660087, D3654810

Test Plan:
```
#!/usr/bin/env python

import binascii

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')

bin = binascii.unhexlify('f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e')
print a._find(bin)
```

yields:

```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:eaf5d75> python foo.py
('\xf2\xe5?\x83\xc5\xdc\x80j\xa2\xed\xa8{\xb1_\xe06{\xaf:~', 4294967295, 285122348L, 8374L)
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3660091

Signature: t1:3660091:1470339492:21a7f3067bda7822c8f396120f99f1dc6e4e26b5
2016-08-04 13:49:19 -07:00
Tony Tung
9b036e561e [cdatapack] expose the find interface
Summary: Needed if we want to do a hybrid implementation of cdatapack

Test Plan: used in following diff.

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3660087

Signature: t1:3660087:1470339373:4e8b548f1509af7f34d0a4bf8bd85723f38d238d
2016-08-04 13:48:32 -07:00
Tony Tung
113f23b65e [cdatapack] implement iteritems()
Summary: `iteritems()` differs from `__iter__()` slightly in that it yields the delta base and delta.

Test Plan:
run this toy program

```
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
for x in a.iterentries():
    print x[0], repr(x[1]), repr(x[2]), len(x[3])
for x in a:
    print x[0], repr(x[1])
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3659133

Signature: t1:3659133:1470339839:dbdce5990a30ffe019ccc44fce97925b64524acd
2016-08-04 13:48:21 -07:00
Tony Tung
a28137668e [cdatapack] skeleton for iterator type
Summary: cdatapack now has a getiter function, and it returns a cdatapack_iterator.

Test Plan:
using this toy program, dumped a pack file.

```
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
for x in a:
    print x
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: durham, mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3659005

Signature: t1:3659005:1470339657:aa39cc57a669b9bc4604933ce35ed20b3f81b468
2016-08-04 13:48:04 -07:00
Tony Tung
e8da5b62df [cdatapack] fix build on linux hosts
Summary:
1. Get ntohl from arpa/inet.h as per the posix spec
2. Get ntohll from endian.h's be64toh

Test Plan: make local

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3671211

Signature: t1:3671211:1470341382:e6b0fe12094246aeb6be09252122bde9680e4599
2016-08-04 13:23:11 -07:00
Tony Tung
7b70d8c572 [cdatapack] skeleton for the python type
Summary:
This is the skeleton for the python type.  Only initialization and the destructor are filled in.

Depends on D3654786.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> ls -l d864669a5651d04505ec6e5e9dba1319cde71f7b*
-r--r--r--  1 tonytung  staff     947666 Jul 26 14:08 d864669a5651d04505ec6e5e9dba1319cde71f7b.dataidx
-r--r--r--  1 tonytung  staff  285130722 Jul 26 14:08 d864669a5651d04505ec6e5e9dba1319cde71f7b.datapack
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> cat foo.py
#!/usr/bin/env python

import cdatapack

a = cdatapack.datapack('d864669a5651d04505ec6e5e9dba1319cde71f7b')
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29> python foo.py
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:0478c29>
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3654810

Signature: t1:3654810:1470175451:c2d4e4acc138685c1030ed98f0afd9379f9fa0c4
2016-08-03 15:29:44 -07:00
Tony Tung
76f5986ab9 [cdatapack] adds an initial cpython file for the cdatapack implementation
Summary: Just a simple module declaration with no logic yet.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:2445a3a> make local

<output snipped>

[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:2445a3a> python
Python 2.7.11 (default, Mar  1 2016, 18:40:10)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cdatapack
>>>
```

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3654786

Signature: t1:3654786:1470175354:c7e8847dcc74c83483d21888ad30cd9242fb461c
2016-08-03 15:29:29 -07:00
Tony Tung
58e395d2e6 [cdatapack] utility to retrieve and checksum the delta chain
Summary:
Given a node sha, find it in the index file and retrieve the deltas.  Checksum the data and dump it.

Depends on D3637000, D3636945

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351> /Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/cdatapack-64b7828e/64b7828e/Debug0/cdatapack_get  d864669a5651d04505ec6e5e9dba1319cde71f7b  f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:68cd351>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3637416

Signature: t1:3637416:1470094723:bce7e903cd0b80c293e16b7532c49e552d3039ef
2016-08-03 15:29:01 -07:00
Tony Tung
8a52eb5dfb [check-code] fix check-code for cdatapack
Summary: self-explanatory

Test Plan: pass `test-check-code-hg.t`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3656834

Signature: t1:3656834:1470175484:11e521a82e9eec9048c53a73f3fecc831dc9dd78
2016-08-02 15:37:08 -07:00
Tony Tung
11daaa8784 [repack] fix debugdatapack test output to reflect the uncompressed data
Summary: Now that we report uncompressed lengths, the test output needs to be updated.

Test Plan: pass `PYTHONPATH=~/work/mercurial/facebook-hg-rpms/fb-hgext/:~/work/mercurial/facebook-hg-rpms/lz4revlog/:~/work/mercurial/facebook-hg-rpms/remotefilelog/  python ~/work/mercurial/facebook-hg-rpms/hg-crew/tests/run-tests.py -j32 test-repack.t`

Reviewers: #fastmanifest, durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3656796

Signature: t1:3656796:1470175464:56bf12516710cef8e8aaa7e7b3e0dbdfa220d797
2016-08-02 15:36:56 -07:00
Olivier Trempe
e95e1526b8 flogheads: add a test 2016-08-02 10:42:00 -07:00
Olivier Trempe
af90628992 flogheads: return an empty list when requesting heads of a non-existing filelog 2016-08-02 10:41:41 -07:00
Olivier Trempe
8005fc4b2f fileserverclient: add wireproto command for requesting a filelog's heads
Allowing discovery of all the heads of a filelog allows supporting some existing
Mercurial use cases, like viewing all the versions of a file in a UI.
2016-08-02 10:40:42 -07:00
Olivier Trempe
51e02acf2c filelogrevset: Return revset.baseset instead of plain list. Add test for kind in path. 2016-08-02 09:40:50 -07:00
Tony Tung
974339f97e [cdatapack] fix index retrieval bugs
Summary:
1. offsets are absolute byte offsets.  convert them to entry offsets to make the bisect code a lot simpler.
2. when writing entries to pack chain, we need to advance the pointer.
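
A minimal sketch of point 1, written in Python for brevity rather than the C of the actual diff. `ENTRIES_START` and `ENTRY_SIZE` are hypothetical values for illustration, not the real index layout; the idea is only that once offsets are entry numbers, the search is a plain bisect.

```python
import bisect

ENTRIES_START = 1026   # hypothetical byte offset of the first index entry
ENTRY_SIZE = 40        # hypothetical fixed size of one index entry, in bytes


def to_entry_offset(byte_offset):
    """Convert an absolute byte offset in the index file to an entry number."""
    relative = byte_offset - ENTRIES_START
    if relative < 0 or relative % ENTRY_SIZE:
        raise ValueError("offset %d is not on an entry boundary" % byte_offset)
    return relative // ENTRY_SIZE


def find(nodes, node, start_byte, end_byte):
    """Bisect the sorted list `nodes` between two absolute byte offsets."""
    lo = to_entry_offset(start_byte)
    hi = to_entry_offset(end_byte)
    i = bisect.bisect_left(nodes, node, lo, hi)
    if i < hi and nodes[i] == node:
        return i
    return None
```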

Depends on D3627122

Test Plan: used in later diff.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3637000

Signature: t1:3637000:1469741885:c2416a3b30e5bb2b64e6bb7062f4c02098be91eb
2016-08-01 14:18:35 -07:00
Tony Tung
641c5f4f01 [debugcommands] return the uncompressed delta length when iterating over a datapack
Summary:
When retrieving a delta chain, datapack.py uncompresses the delta chain data.  However, when iterating over the datapack, we get the compressed length.  This is not desirable as the output is no longer consistent.  This diff peeks into the lz4 header to get the uncompressed length when iterating.
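
A rough sketch of the header peek. It assumes each compressed delta starts with a 4-byte little-endian uncompressed length, as the legacy python-lz4 binding writes it; that framing and the name `uncompressed_length` are assumptions, not taken from the diff.

```python
import struct


def uncompressed_length(blob):
    """Return the uncompressed size recorded in the first 4 bytes of blob.

    Assumption: the blob starts with a 4-byte little-endian length header.
    """
    if len(blob) < 4:
        raise ValueError("blob too short to contain a length header")
    (length,) = struct.unpack_from("<I", blob, 0)
    return length
```

Only the header is read, so iteration does not have to decompress every delta just to report its length.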

Depends on D3627119

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node                                      Delta Base                                Delta SHA1                                Delta Length
f2e53f83c5dc806aa2eda87bb15fe0367baf3a7e  0000000000000000000000000000000000000000  f32b366a6c44430df6526133f82f9638426ba9c5  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b | tail -n 4

source/zippydb/tier_spec/tier_settings/zippydb.wildcard.tmpfs.zippydb_settings.cconf
Node          Delta Base    Delta Length
f2e53f83c5dc  000000000000  37769
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:e2ef218>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3636945

Signature: t1:3636945:1469811243:b21d90d9599244ed4600c5336818b9a18eacf3ff
2016-08-01 14:16:29 -07:00
Tony Tung
20126b3bd8 [cdatapack] fix memory handling for cdatapack
Summary:
`->index_table` is not heap-allocated.  However, `->fanout_table` is and should be released.

Also added call to `close_datapack()` at the end of `cdatapack_dump.c`.

Depends on D3627122

Test Plan: valgrind is much happier now.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3631368

Signature: t1:3631368:1469741779:e0c4e5d59c7e73c8aa3507901df3005383f0d3f5
2016-08-01 14:11:16 -07:00
Tony Tung
705c0731b6 [remotefilelog] initial checkin of a c datapack parser
Summary: This is not yet complete, but seems to be able to parse a data file.

Test Plan:
`/Users/tonytung/Library/Caches/CLion2016.2/cmake/generated/cdatapack-64b7828e/64b7828e/Debug/cdatapack_dump d864669a5651d04505ec6e5e9dba1319cde71f7b > /tmp/2`

compare it with the output of `hg debugdatapack --long d864669a5651d04505ec6e5e9dba1319cde71f7b > /tmp/1`

and it exactly matches.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627122

Signature: t1:3627122:1470085301:c9b9e8b2fa57bb7a09dd56d3c811ff8eadbb85ba
2016-08-01 14:05:37 -07:00
Tony Tung
9e557758b0 [datapack] add --node as a parameter to dump extra data about a node
Summary:
It obtains the deltachain and dumps the chain to the console.

Depends on D3627117.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:3266095> hg debugdatapack d864669a5651d04505ec6e5e9dba1319cde71f7b --node ba5fbf1aba48f25d46228626917b2705adc9e7c8

arcanist/__phutil_library_map__.php
Node                                      Delta Base                                Delta SHA1                                Delta Length
ba5fbf1aba48f25d46228626917b2705adc9e7c8  0000000000000000000000000000000000000000  df442a6f976b946c266f76b0f63a198e8aabf809  3993
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:3266095>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627119

Signature: t1:3627119:1469738313:d61726585a020ed4cbabbb1f623eb202ccd51b9f
2016-07-28 17:15:21 -07:00
Tony Tung
7111de0994 [datapack] allow for long hashes to be printed
Summary:
This will help verify the C datapack reader.

Depends on D3627112.

Test Plan:
```
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65> hg debugdatapack --long  d864669a5651d04505ec6e5e9dba1319cde71f7b  | head -n 15

arcanist/__phutil_library_map__.php
Node                                      Delta Base                                Delta Length
ba5fbf1aba48f25d46228626917b2705adc9e7c8  0000000000000000000000000000000000000000  1265

arcanist/canary/FacebookConfigeratorArcanistCanaryWorkflow.php
Node                                      Delta Base                                Delta Length
142f9991fca1a16c6544cb6e5a0071296e712268  0000000000000000000000000000000000000000  6546

arcanist/lint/FacebookConfigeratorLintEngine.php
Node                                      Delta Base                                Delta Length
c8630501c45f1bc1dc47df2ee2ad354993438cdb  0000000000000000000000000000000000000000  2811

arcanist/lint/linter/FbcodePyFlake8Linter.php
Node                                      Delta Base                                Delta Length
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65> hg debugdatapack   d864669a5651d04505ec6e5e9dba1319cde71f7b  | head -n 15

arcanist/__phutil_library_map__.php
Node          Delta Base    Delta Length
ba5fbf1aba48  000000000000  1265

arcanist/canary/FacebookConfigeratorArcanistCanaryWorkflow.php
Node          Delta Base    Delta Length
142f9991fca1  000000000000  6546

arcanist/lint/FacebookConfigeratorLintEngine.php
Node          Delta Base    Delta Length
c8630501c45f  000000000000  2811

arcanist/lint/linter/FbcodePyFlake8Linter.php
Node          Delta Base    Delta Length
[andromeda]:~/work/mercurial/facebook-hg-rpms/remotefilelog:4063a65>
```

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627117

Signature: t1:3627117:1469735318:103e9a21be082749332572c9c4f9942ea9c1c248
2016-07-28 17:07:04 -07:00
Tony Tung
3da845968f [datapack] fix computation of the paged-in size
Summary:
It should include the filelen and the deltalen fields, which are
2 and 8 bytes.

Test Plan: visual.

Reviewers: durham

Reviewed By: durham

Subscribers: mitrandir

Differential Revision: https://phabricator.intern.facebook.com/D3627108

Signature: t1:3627108:1469742083:ffb59768906d9e5463065eec92e1c80cc8482884
2016-07-28 17:06:50 -07:00
Olivier Trempe
a2ce732706 fileserverclient: fixed lingering ssh connection due to reference cycle on pull operations
Calling wrapfunction on the remotefilepeer(sshpeer) object in exchangepull
function introduces a reference cycle. Hence, this object will not be deleted
until the process dies. This is not a big issue for processes having a short
lifetime (e.g. launched from the command line).
However, for persistent processes (e.g. TortoiseHg), this can lead to multiple
lingering ssh connections to the server (actually one per pull operation).

The fix is to not wrap the remotefilepeer._callstream. This method is defined
right into the remotefilepeer object. The required repo data is made available
in the remotefilepeer object by monkeypatching this object in the exchangepull
function.
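
A small illustration of the cycle, assuming CPython reference counting. `Peer` and `wrap` are simplified, hypothetical stand-ins for remotefilepeer and for wrapping a method on an instance; they are not the real Mercurial code.

```python
import gc
import weakref


class Peer(object):
    """Hypothetical stand-in for remotefilepeer."""

    def _callstream(self, cmd):
        return cmd


def wrap(obj, name, wrapper):
    """Simplified stand-in for wrapping a method on an instance."""
    orig = getattr(obj, name)          # bound method: holds a reference to obj

    def wrapped(*args, **kwargs):
        return wrapper(orig, *args, **kwargs)

    setattr(obj, name, wrapped)        # obj -> wrapped -> orig -> obj


def passthrough(orig, *args, **kwargs):
    return orig(*args, **kwargs)


def survives_del(wrapit):
    """Return True if a Peer is still alive after its last name is dropped."""
    was_enabled = gc.isenabled()
    gc.disable()                       # rule out the cycle collector
    try:
        peer = Peer()
        if wrapit:
            wrap(peer, "_callstream", passthrough)
        ref = weakref.ref(peer)
        del peer
        alive = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    gc.collect()                       # clean up the cycle we created
    return alive
```

Without the wrap, dropping the last name frees the object right away. With it, reference counting alone never frees the object, which is how the ssh connection ends up lingering.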
2016-07-22 13:47:02 -07:00
Olivier Trempe
0368ca40fc Fix filelogrevset not properly handling "kind" in path 2016-07-22 13:09:48 -07:00
Durham Goode
df65096278 pull: add more requirement checking
In some situations the remotefilelog setup logic could be called, which will
wrap certain functions, and then later a call will happen to a repo that wasn't
remotefilelog which will run some remotefilelog code because of the wrapping.

Normally we take care of this by checking for the remotefilelog requirement. We
missed it in this one spot though.
2016-07-22 12:33:56 -07:00
Martin von Zweigbergk
0df928d828 shallowbundle: specifically compare instance to remotefilelog.remotefilelog
In two places, we were checking if a revlog was an instance of
revlog.revlog and, I think, treating it as a
remotefilelog.remotefilelog otherwise. I noticed this when I created
another non-revlog.revlog revlog in narrowhg and remotefilelog thought
it was a remotefilelog.remotefilelog. Let's specifically check if it's
a remotefilelog.remotefilelog instead.
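
To illustrate the difference, a sketch with empty stand-in classes. `narrowrevlog` is a hypothetical name for the third kind of revlog; the real classes are not reproduced here.

```python
class revlog(object):
    """Stand-in for revlog.revlog."""


class remotefilelog(object):
    """Stand-in for remotefilelog.remotefilelog."""


class narrowrevlog(object):
    """Hypothetical third kind of revlog that subclasses neither."""


def isremote_old(rl):
    # Old check: anything that is not a revlog.revlog is assumed remote.
    return not isinstance(rl, revlog)


def isremote_new(rl):
    # New check: only an actual remotefilelog counts.
    return isinstance(rl, remotefilelog)
```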
2016-07-15 23:53:09 -07:00
Durham Goode
073f8e3d22 repack: unmap memory occasionally to reclaim space
Summary:
When running large repack operations, the resident size of the process
could become quite large, since we're scanning in entire pack files. Linux/OSX
have api calls for telling the kernel it's ok to release some of that memory,
but those apis are not exposed to python.

So instead, let's unmap and remap the mmap's once a certain amount of data has
been read. I also tried changing the mmap accessors to use the file oriented api
(mmap.read(), mmap.seek(), etc) so we could switch to actual file handles during
repack, but it had a drastic effect on normal performance (repack took 1 hour
instead of a few minutes).

Long term we should move all of this logic to c++ so we can use the more
powerful APIs.
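
A minimal sketch of the unmap/remap idea, not the code from this diff. `RemappingReader`, its `limit` parameter and the `remaps` counter are hypothetical names for illustration.

```python
import mmap


class RemappingReader(object):
    """Read slices of a file through mmap, remapping after `limit` bytes."""

    def __init__(self, path, limit):
        self._fp = open(path, "rb")
        self._limit = limit
        self._bytesread = 0
        self.remaps = 0
        self._map = self._newmap()

    def _newmap(self):
        return mmap.mmap(self._fp.fileno(), 0, access=mmap.ACCESS_READ)

    def read(self, offset, length):
        if self._bytesread >= self._limit:
            # Drop the old mapping so the kernel can reclaim its pages,
            # then map the same file again.
            self._map.close()
            self._map = self._newmap()
            self._bytesread = 0
            self.remaps += 1
        data = self._map[offset:offset + length]
        self._bytesread += len(data)
        return data

    def close(self):
        self._map.close()
        self._fp.close()
```

The accessors stay slice-based, so normal reads keep the fast path; only the occasional remap adds cost.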

Test Plan:
Did a full repack on a laptop and verified memory capped out at 2GB
instead of exceeding 5GB.

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3545171
2016-07-12 11:46:48 -07:00
Durham Goode
77192943d4 repack: handle race condition with background repacks
Summary:
There was a race condition where if a repack is running and another hg process
launches, the new process will only see the original packs, and not any of the
new packs (even though the source blobs are being deleted from disk by the
repack).

The fix is to allow our pack store to refresh its list of packs every so often.
In this particular implementation we do it at most every 100ms. A more robust
strategy would be to group key misses and only check for new packs at the end
once we have a list of all the misses, but this would require significant
refactoring to make everything grouped. This case should only ever happen during
repacks, so it should almost never occur more than once during a command, so the
100ms version is probably good enough.
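
The throttled refresh could look roughly like this. `PackStore`, `refresh_if_stale` and the injectable `clock` are hypothetical; only the 100ms limit comes from the summary above.

```python
import os
import time

REFRESHRATE = 0.1  # seconds; the 100ms from the summary


class PackStore(object):
    """Hypothetical pack store listing *.datapack files in one directory."""

    def __init__(self, path, clock=time.time):
        self._path = path
        self._clock = clock
        self._lastrefresh = clock()
        self.packs = self._listpacks()

    def _listpacks(self):
        return sorted(
            f for f in os.listdir(self._path) if f.endswith(".datapack")
        )

    def refresh_if_stale(self):
        """Rescan after a key miss, at most once per REFRESHRATE.

        Returns True if the directory was rescanned.
        """
        now = self._clock()
        if now - self._lastrefresh < REFRESHRATE:
            return False
        self._lastrefresh = now
        self.packs = self._listpacks()
        return True
```

A caller would invoke `refresh_if_stale()` on a key miss and retry the lookup only when it returns True.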

Test Plan:
Ran `hg up && hg pull && sleep 0.2 && hg up master` in a loop with a
break point in the refresh code and caught it executing in a situation where the
background repack had removed the original sources and put them in a new pack.
Verified that it loaded the data from the new pack correctly.

Reviewers: #mercurial, ttung, lcharignon

Reviewed By: lcharignon

Subscribers: lcharignon

Differential Revision: https://phabricator.intern.facebook.com/D3524314

Signature: t1:3524314:1467907680:85be07ad953811000c468852eb0626f4d8b53a13
2016-07-07 15:59:06 -07:00
Durham Goode
15fcba5c21 cachegroup: fix directory permissions for shared cache
Summary:
The shared cache needs to be completely g+ws so that all members of the group
can write to each directory in it. The old code only applied g+ws to the leaf
directories, so other users couldn't write to non-leaf directories (e.g. for
hgcache/7a/83beca8.../, others couldn't write to 7a/).
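
A sketch of applying g+ws to every directory created along the path, not just the leaf. `makegroupdirs` is a hypothetical helper, not the function changed in this diff, and it leaves out the chgrp to the configured cache group.

```python
import os
import stat

GROUP_BITS = stat.S_IWGRP | stat.S_ISGID   # g+w plus setgid, i.e. g+ws


def makegroupdirs(root, relpath):
    """Create root/relpath, applying g+ws to each directory it creates."""
    current = root
    for part in relpath.split("/"):
        if not part:
            continue
        current = os.path.join(current, part)
        if not os.path.isdir(current):
            os.mkdir(current)
            mode = stat.S_IMODE(os.stat(current).st_mode)
            os.chmod(current, mode | GROUP_BITS)
    return current
```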

Test Plan:
Updated a test to view group permissions for the intermediate
directories

Reviewers: #mercurial, ttung, simpkins

Reviewed By: simpkins

Subscribers: lcharignon, net-systems-diffs@, simpkins, mbolin

Differential Revision: https://phabricator.intern.facebook.com/D3523918

Signature: t1:3523918:1467930221:452b11b56a2e69896bf8d2cd0acd7131b41f90d8
2016-07-07 15:58:59 -07:00
Durham Goode
7c44b94bb0 repack: fix repack heuristic to account for unusual copies
Summary:
Previously, the history repack logic would stop traversing history for a given
filename once it encountered a rename. This isn't quite right, since the history
could eventually be traversed back to the original file, where we'd need to
continue processing. So now we check for when the copyfrom becomes the filename.

Also, if the copy source file and the copy target file have two nodes with the
same value, we would not process the one in the copy target (since it was marked
do not process). We fix this by explicitly checking if the node is one of the
known entries in the file being processed.

Test Plan: Added a test

Reviewers: #mercurial, ttung, mitrandir, rmcelroy

Reviewed By: mitrandir, rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3523215

Signature: t1:3523215:1467828169:bd487c8f296352c1a1b9355cb55f9001bd5e19a9
2016-07-07 15:58:47 -07:00
Martin von Zweigbergk
adcdb9289c commands: tell @command decorator about arguments
Before this patch, debugremotefilelog and verifyremotefilelog would
crash if not given a path. Also, many commands would accept arguments
they then ignored.
2016-06-30 10:14:17 -07:00
Martin von Zweigbergk
c9390fde26 debugdatapack: make function name match command 2016-06-30 10:11:37 -07:00
Olivier Trempe
d8d662c766 shallowutil: windows compatibility for readonly files
On Windows, os.rename cannot rename readonly files and cannot overwrite
destination if it already exists. Create small wrappers to handle these cases.
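
A minimal sketch of such a wrapper, assuming it is fine to delete the destination first. `replacefile` is a hypothetical name, and unlike a POSIX rename it is not atomic.

```python
import os
import stat


def replacefile(src, dst):
    """Rename src over dst even if either is read-only or dst exists."""
    if os.path.exists(dst):
        # Clear the read-only bit so the destination can be removed.
        dstmode = stat.S_IMODE(os.stat(dst).st_mode)
        os.chmod(dst, dstmode | stat.S_IWRITE)
        os.unlink(dst)
    srcmode = stat.S_IMODE(os.stat(src).st_mode)
    os.chmod(src, srcmode | stat.S_IWRITE)   # make src renameable
    os.rename(src, dst)
    os.chmod(dst, srcmode)                   # restore the original mode
```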
2016-07-05 00:30:42 -07:00
Durham Goode
0b111a5610 basestore: fix incorrect variable name 2016-06-22 15:55:38 -07:00
Laurent Charignon
af2ffcc620 check-code: add test-bad-config to check code blacklist 2016-06-20 16:29:52 -07:00
Laurent Charignon
9a1fb623cc shallowutil: add missing import
Summary:
Before this patch, we were not importing mercurial.error, which caused a
crash when calling error.Abort. This patch adds the missing import.

Test Plan: Tests pass; added a new test

Reviewers: durham

Differential Revision: https://phabricator.intern.facebook.com/D3457086
2016-06-20 15:18:14 -07:00
Durham Goode
b78655b899 tests: attempt to fix more test flakiness
In this race condition test, occasionally the second invocation would actually
obtain the lock before the first. This meant that the first repack would fail
with an error message while the second would exit with 0, resulting in the test
output changing slightly. Let's introduce a slight delay before the second
invocation to prevent this from happening.
2016-06-19 19:05:55 -07:00
Durham Goode
972e35e2ba tests: fix source of flakey tests
These should've been globs to begin with.
2016-06-19 18:40:46 -07:00
Durham Goode
d7722fcc7c stores: reverse order of cache and local stores
In the old days we would check the cache first, then the local store. This was
important because the cache is more likely to contain correct data (since it
comes from the final pushed version of commits), versus local data which may
contain information about stripped commits.

As part of the big store refactor, this order got switched unintentionally. So
let's switch it back.
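
The effect of the ordering can be sketched with plain dicts standing in for the stores. `UnionStore` here is a simplified, hypothetical version, not the real class.

```python
class UnionStore(object):
    """Return the first hit from a list of stores, checked in order."""

    def __init__(self, *stores):
        self.stores = stores

    def get(self, key):
        for store in self.stores:
            try:
                return store[key]
            except KeyError:
                continue
        raise KeyError(key)


cache = {"k": "pushed"}                       # shared cache
local = {"k": "stripped", "l": "local-only"}  # local store
store = UnionStore(cache, local)              # cache first, then local
```

With this order, `store.get("k")` returns the cache's copy, and the local store is only consulted for keys the cache lacks.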
2016-06-16 10:22:31 -07:00
Jeroen Vaelen
07efaadb9d [remotefilelog] use hashlib to compute sha1 hashes
Summary:
hg-crew's c27dc3c3122 and c27dc3c3122^ were breaking our extensions:

```
$ hg log -r c27dc3c3122^
changeset:   9010734b79911d2d2e7405d91a4df479b35b3841
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:12:33 -0700
summary:     cleanup: replace uses of util.(md5|sha1|sha256|sha512) with hashlib.\1
```

```
$ hg log -r c27dc3c3122
changeset:   0d55a7b8d07bf948c935822e6eea85b044383f00
user:        Augie Fackler <raf@durin42.com>
date:        Thu, 09 Jun 2016 21:13:23 -0700
summary:     util: drop local aliases for md5, sha1, sha256, and sha512
```

I did a grep over facebook-hg-rpms to see what was affected:
```
$ grep "util\.\(md5\|sha1\|sha256\|sha512\)" -r ~/facebook-hg-rpms
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:            sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basestore.py:                sha = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/shallowutil.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/debugcommands.py:    filekey = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        namehash = util.sha1(name).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        node = util.sha1(filename).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/historypack.py:        files = ((util.sha1(filename).digest(), offset, size)
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/fileserverclient.py:    pathhash = util.sha1(file).hexdigest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/remotefilelog/basepack.py:        self.sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-datapack.py:        return util.sha1(content).digest()
/home/jeroenv/facebook-hg-rpms/remotefilelog/tests/test-histpack.py:        return util.sha1(content).digest()
Binary file /home/jeroenv/facebook-hg-rpms/hg-crew/.hg/store/data/mercurial/revlog.py.i matches
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:            return util.sha1(fh.read()).hexdigest()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:        sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/fb-hgext/sparse.py:    sha1 = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/simple4server.py:        sha = util.sha1()
/home/jeroenv/facebook-hg-rpms/mutable-history/hgext/evolve.py:        sha = util.sha1()
```
This diff is part of the fix.
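
The replacement is mechanical. A sketch of the hashlib form of the `pathhash` computation listed above; the helper name and the UTF-8 fallback for non-bytes names are assumptions added for illustration.

```python
import hashlib


def pathhash(filename):
    """Hex SHA-1 of a file name, as util.sha1(file).hexdigest() did before."""
    if not isinstance(filename, bytes):
        # Assumption: non-bytes names are encoded as UTF-8 before hashing.
        filename = filename.encode("utf-8")
    return hashlib.sha1(filename).hexdigest()
```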

Test Plan:
Ran the tests.
```
$MERCURIALRUNTEST -S -j 48 --with-hg ~/local/facebook-hg-rpms/hg-crew/hg
```

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.intern.facebook.com/D3440041

Tasks: 11762191
2016-06-15 15:48:16 -07:00
Durham Goode
d860baf210 cachegroup: fix pack path use of cachegroup
Summary:
The pack path logic did not use the correct unix group when
remotefilelog.cachegroup was specified. This fixes that.

Test Plan:
I manually tested it by deleting a pack dir and running repack. This
is hard to create an automated test for since the feature isn't really cross
platform, and we don't have a way to know what groups they have on their
machine.

Reviewers: #sourcecontrol, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3400756

Tasks: 11584114

Signature: t1:3400756:1465342537:ed023f6dc830117df5e85e294a41486f072714c9
2016-06-08 09:09:06 -07:00
Durham Goode
c2d89eeebc test: add test to cover copyfrom issue
The previous commit fixed a bug where copyfrom data was represented incorrectly
in the local .hg/store/data remotefilelog blobs when the ancestor data was read
from a pack file. This commit adds a test for that situation.
2016-06-06 15:07:27 -07:00
Durham Goode
7cb6908a76 copyfrom: fix copy metadata in local blobs
The new pack stores return None for the copyfrom field, instead of the expected
''. We need the local file blob generator to handle this case, instead of just
putting None in the copyfrom field.
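
In sketch form, the normalization amounts to the following. `normalizecopyfrom` is a hypothetical helper name, not the function touched by this commit.

```python
def normalizecopyfrom(copyfrom):
    """Map the pack stores' None to the '' the blob generator expects."""
    return copyfrom if copyfrom is not None else ""
```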
2016-06-06 14:16:06 -07:00
Durham Goode
3d127ad4a3 repack: cleanup empty directories
Summary:
Now that repack can clean up old remotefilelog blobs, let's have it also delete
any empty directories that get left behind.

Test Plan: Updated an existing test to cover it

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385546

Signature: t1:3385546:1464972782:5ca63cf0a5589bb8a537957f50b4bc5ec4e0f0f5
2016-06-06 10:04:18 -07:00
Durham Goode
cfba85e8f3 Fix missing test glob 2016-06-03 17:32:39 -07:00
Durham Goode
6f3d6c53f5 utils: unify cachepath access through a util function
Summary:
Previously a bunch of different places accessed the cachepath through ui.config
directly. This is a problem because we need to resolve any environment variables
in the path, and some spots didn't do this. So let's unify all accesses through
a helper function that takes care of the environment variables.
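
A sketch of such a helper. The `config` dict is a hypothetical stand-in for the ui.config lookup, and the expansion uses the standard library rather than whatever the real helper calls.

```python
import os


def getcachepath(config):
    """Return the cache path with environment variables and ~ expanded.

    `config` is a hypothetical dict standing in for ui.config lookups.
    """
    path = config.get("remotefilelog.cachepath")
    if not path:
        raise ValueError("remotefilelog.cachepath is not set")
    return os.path.expanduser(os.path.expandvars(path))
```

Routing every access through one function means a path such as `/tmp/hgcache/$USER` is resolved the same way everywhere.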

Test Plan: Added a test

Reviewers: mitrandir, lcharignon, #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3385583

Signature: t1:3385583:1464971813:5b9ee5ed3d6ff9f1a78cb9e0269e433844758c9d
2016-06-03 09:45:58 -07:00
Durham Goode
01595d2684 repack: allow background repacks to repack non-pack stores
Previously, background repacks would only repack pack files, which meant there
was no automated way to repack loose remotefilelog files without manually
running 'hg repack'. This allows incremental repacks to also pack the loose
files.

It also changes the config knob for background repacks, so we can enable pack
file usage without the server having to support it just yet.
2016-06-01 10:06:35 -07:00
Durham Goode
8b6c78b675 unionstore: allow incomplete delta chains
A previous patch allowed the unionmetadatastore to return partial histories if a
certain config was provided. This allowed repack to get partial history information.
This patch does the same for deltachains. This isn't currently used, but will be
used in the future to allow repacking packs with partial delta chains by just
lifting them out of one pack and putting them directly in another.
2016-05-26 02:15:46 -07:00
Durham Goode
2450b3f243 unionstore: allow partial history output from union stores
Previously, a union store required that it be able to compute the entire history
of the revision. This caused problems in repack, since it may only have a
partial history. Instead of throwing a KeyError and giving the repack algorithm
no history information at all, we add a config knob to let the repack logic
specify that it's ok with partial histories.

The next patch will do the same for contentstore.
2016-05-26 02:13:53 -07:00
Durham Goode
a2646d8da9 packs: change LookupError to KeyError
We've unified on KeyError being the error thrown when the pack is missing the
desired filename+filehash, but there were a few old places still using
LookupError. This patch changes them to also be KeyError.

This fixes an issue where a repack could throw a LookupError when it only had a
partial history of a file. Now that we throw a KeyError, the exception is caught
and handled appropriately.
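
Why the distinction matters: in Python, KeyError is a subclass of LookupError, so a handler for KeyError does not catch a plain LookupError. A small sketch with hypothetical stand-in functions:

```python
def lookup_or_none(store, key):
    """Return store(key), or None when the store signals a miss."""
    try:
        return store(key)
    except KeyError:
        return None


def oldpack(key):
    raise LookupError(key)   # old style: escapes the handler above


def newpack(key):
    raise KeyError(key)      # unified style: treated as a miss
```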
2016-05-26 02:07:11 -07:00
Durham Goode
304c6f5bd0 pack: move common pack logic into basepack
Summary:
This moves the common logic from datapack and historypack into a common
basepack. At the moment the only common logic is the constructor, which handles
version checking, fanout initialization, and mmap stuff.

Test Plan: Ran the tests

Reviewers: mitrandir, #mercurial, ttung, mjpieters

Reviewed By: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D3306558

Signature: t1:3306558:1463474571:35d3d2e71849b8111e5455da2dd4810725a35523
2016-05-24 02:15:58 -07:00