Commit Graph

2547 Commits

Author SHA1 Message Date
Jun Wu
cea2bf8728 dag: limit segment level at open time
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.

This turns out to save 300ms in profiling result.

Reviewed By: sfilipco

Differential Revision: D23494509

fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
2020-09-03 13:48:43 -07:00
Jun Wu
f238529a97 multilog: use per-log meta to pick up updated indexes
Summary:
With MultiLog, per-log meta was previously entirely ignored. However, it can
be useful for updated indexes. For example, an application defines a new index
and opens a Log via MultiLog. The application would expect the new index to be
built only once. Without MultiLog, per-log meta is updated at open time in
place. With MultiLog, the updated index meta is not written back to the
multimeta so the new index would be rebuilt multiple times undesirably.

Update MultiLog to reuse the per-log meta if it's compatible so it can pick up
new indexes.

Reviewed By: sfilipco

Differential Revision: D23488212

fbshipit-source-id: c8b3e6b5589dbda2e76a143d15085862a93dae22
2020-09-03 13:48:43 -07:00
Jun Wu
f79e7657af multilog: stop writing poisoned per-log meta
Summary:
The poisoned meta makes investigation harder. ex. `debugdumpindexlog` won't
work on those logs.

Reviewed By: sfilipco

Differential Revision: D23488213

fbshipit-source-id: b33894d8c605694b6adf5afdaed45707fbd7357e
2020-09-03 13:48:43 -07:00
Jun Wu
99511f8743 dag: benchmark dag_ops on different IdDagStores
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:

  benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
  building segments (old)                           856.803 ms
  building segments (new)                           127.831 ms
  ancestors                                          54.288 ms
  children (spans)                                  619.966 ms
  children (1 id)                                    12.596 ms
  common_ancestors (spans)                            3.050 s
  descendants (small subset)                         35.652 ms
  gca_one (2 ids)                                   164.296 ms
  gca_one (spans)                                     3.132 s
  gca_all (2 ids)                                   270.542 ms
  gca_all (spans)                                     2.817 s
  heads                                             247.504 ms
  heads_ancestors                                    40.106 ms
  is_ancestor                                       108.719 ms
  parents                                           243.317 ms
  parent_ids                                         10.752 ms
  range (2 ids)                                       7.370 ms
  range (spans)                                      23.933 ms
  roots                                             620.150 ms

  benchmarking dag::iddagstore::in_process_store::InProcessStore
  building segments (old)                           790.429 ms
  building segments (new)                            55.007 ms
  ancestors                                           8.618 ms
  children (spans)                                  196.562 ms
  children (1 id)                                     2.488 ms
  common_ancestors (spans)                          545.344 ms
  descendants (small subset)                          8.093 ms
  gca_one (2 ids)                                    24.569 ms
  gca_one (spans)                                   529.080 ms
  gca_all (2 ids)                                    38.462 ms
  gca_all (spans)                                   540.486 ms
  heads                                             103.930 ms
  heads_ancestors                                     6.763 ms
  is_ancestor                                        16.208 ms
  parents                                           103.889 ms
  parent_ids                                          0.822 ms
  range (2 ids)                                       1.748 ms
  range (spans)                                       6.157 ms
  roots                                             197.924 ms

  benchmarking dag::iddagstore::bytes_store::BytesStore
  building segments (old)                           724.467 ms
  building segments (new)                            90.207 ms
  ancestors                                          23.812 ms
  children (spans)                                  348.237 ms
  children (1 id)                                     4.609 ms
  common_ancestors (spans)                            1.315 s
  descendants (small subset)                         20.819 ms
  gca_one (2 ids)                                    72.423 ms
  gca_one (spans)                                     1.346 s
  gca_all (2 ids)                                   116.025 ms
  gca_all (spans)                                     1.470 s
  heads                                             155.667 ms
  heads_ancestors                                    19.486 ms
  is_ancestor                                        51.529 ms
  parents                                           157.285 ms
  parent_ids                                          5.427 ms
  range (2 ids)                                       4.448 ms
  range (spans)                                      13.874 ms
  roots                                             365.568 ms

Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).
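
As a rough illustration of the `Vec<BTreeMap<Id, StoreId>>` shape described
above, the sketch below uses Python (not the Rust of the dag crate). The class
name `LevelHeadIndex`, its methods, and the "smallest head at or after an id"
lookup are assumptions made for the example, not the crate's API. Sorted lists
with `bisect` stand in for the BTreeMap.

```python
from bisect import bisect_left


class LevelHeadIndex:
    """Hypothetical per-level index: levels[level] maps head id -> store id."""

    def __init__(self):
        # The outer list is indexed directly by level (the "Vec" part).
        # Each entry holds two parallel sorted lists standing in for a BTreeMap.
        self.levels = []

    def insert(self, level, head, store_id):
        # Grow the outer list so that `level` is a valid index.
        while len(self.levels) <= level:
            self.levels.append(([], []))
        heads, store_ids = self.levels[level]
        pos = bisect_left(heads, head)
        if pos < len(heads) and heads[pos] == head:
            store_ids[pos] = store_id  # overwrite an existing key
        else:
            heads.insert(pos, head)
            store_ids.insert(pos, store_id)

    def first_at_or_after(self, level, id_):
        """Return (head, store_id) for the smallest head >= id_, or None."""
        if level >= len(self.levels):
            return None
        heads, store_ids = self.levels[level]
        pos = bisect_left(heads, id_)
        if pos == len(heads):
            return None
        return heads[pos], store_ids[pos]
```

The level lookup is a plain list index, and the ordered search within a level
is a binary search over contiguous data.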

Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.

Reviewed By: sfilipco

Differential Revision: D23438174

fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
2020-09-02 18:54:12 -07:00
Jun Wu
84ad7a5351 dag: implement GetLock for all IdDagStores
Summary: This allows them to use the SyncableIdDag APIs.

Reviewed By: sfilipco

Differential Revision: D23438170

fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
2020-09-02 18:54:12 -07:00
Jun Wu
cfff0e9144 dag: make IdDag::prepare_filesystem_sync generic
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.

Reviewed By: sfilipco

Differential Revision: D23438180

fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
2020-09-02 18:54:12 -07:00
Jun Wu
8874e07f9b dag: IdDagStore::reload -> GetLock::reload
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock.  So
let's move the API there, and require a lock.

Reviewed By: sfilipco

Differential Revision: D23438169

fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
2020-09-02 18:54:12 -07:00
Jun Wu
d633576880 dag: remove NameDag::reload
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.

Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.

This also solves an issue where `NameDag` could not recover properly if its
`flush` failed: the old `NameDag` state is no longer lost.

After removing `NameDag::reload`, `IdMap::reload` is no longer used publicly
and was made private.

Reviewed By: sfilipco

Differential Revision: D23438179

fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
2020-09-02 18:54:11 -07:00
Jun Wu
8e16e4260f dag: IdDagStore::sync -> GetLock::persist
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from lock. Rename to `persist`
to clarify it's memory -> disk. In addition, require a reference to a lock object
as lightweight proof that some lock is held.

Reviewed By: sfilipco

Differential Revision: D23438175

fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
2020-09-02 18:54:11 -07:00
Jun Wu
3ad58ff945 dag: make SyncableIdMap use &mut IdMap instead of IdMap
Summary:
This removes the need of cloning `IdMap`.

SyncableIdMap is a bit tricky. I added some comments to clarify things.

Reviewed By: sfilipco

Differential Revision: D23438176

fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
2020-09-02 18:54:11 -07:00
Jun Wu
23f9bec22b dag: move IdDagStore impls to separate files
Summary: This makes `iddagstore.rs` cleaner.

Reviewed By: sfilipco

Differential Revision: D23438177

fbshipit-source-id: 465cec2231a084a36b20da8e413cb9272f64a00a
2020-09-02 18:54:10 -07:00
Jun Wu
4e9200db44 dag: test IndexedLogIdDagStore
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.

Reviewed By: sfilipco

Differential Revision: D23438173

fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
2020-09-02 18:54:10 -07:00
Stefan Filip
1ddf5aaa0e tools: add location-to-hash command to read_res
Summary:
There aren't too many things that we can do with the responses that we get back
from the server. Things are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.

Reviewed By: kulshrax

Differential Revision: D23456220

fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
2020-09-02 17:20:43 -07:00
Stefan Filip
932450fb15 handlers: update location-to-hash endpoint with count parameter
Summary:
To reduce the size over the wire on cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.
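
A minimal sketch of the idea, written in Python for brevity. The function
name, the `first_parent` mapping, and the exact meaning of `distance` and
`count` are assumptions for illustration and are not taken from the endpoint
code: a location is resolved by walking `distance` first-parent steps from a
known descendant, and `count` hashes are then returned along the same chain.

```python
def location_to_hashes(first_parent, descendant, distance, count):
    """Resolve a (descendant, distance) location and return up to `count`
    hashes, following first parents. All names here are hypothetical."""
    node = descendant
    for _ in range(distance):
        node = first_parent.get(node)
        if node is None:
            raise ValueError("distance walks past the root")
    result = []
    for _ in range(count):
        if node is None:
            break  # ran out of ancestors before reaching `count`
        result.append(node)
        node = first_parent.get(node)
    return result


# A linear history a <- b <- c <- d, keyed by child.
history = {"d": "c", "c": "b", "b": "a", "a": None}
chain = location_to_hashes(history, "d", 1, 2)
```

With one request the client gets `c` and `b` here, instead of asking for each
hash separately.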

Reviewed By: kulshrax

Differential Revision: D23456216

fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
2020-09-02 17:20:42 -07:00
Stefan Filip
7122cdded7 types: rename Location to CommitLocation
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.

Reviewed By: kulshrax

Differential Revision: D23456221

fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
2020-09-02 17:20:42 -07:00
Durham Goode
537d5858bd archive: block full archives in large repositories
Summary:
The default archive behavior archives the entire working copy. That is
undesirable and easy to accidentally trigger in a large repository. Let's
prevent it and require users to specify what they want archived.

Reviewed By: quark-zju

Differential Revision: D23464818

fbshipit-source-id: c39a631d618c2007e442e691cda542400cf8f4c3
2020-09-02 11:38:08 -07:00
Stefan Filip
c2079c3464 revisionstore: use async-runtime crate for lfs
Summary:
Replacing uses of the custom Runtime in lfs with the global runtime in the
`async-runtime` crate.

Reviewed By: xavierd

Differential Revision: D23468347

fbshipit-source-id: 61d2858634a37eb2d7d807104702d24889ec047a
2020-09-02 10:01:08 -07:00
Thomas Orozco
de260c7e9d py3: fix debugstacktrace
Summary:
debugstacktrace is broken right now on Python 3: it wants to write to stderr,
which expects `bytes`, but it tries to write a `str`. This fixes it.
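
The general shape of this kind of fix can be sketched as follows. The helper
`write_text` is hypothetical and is not the code in this diff; `io.BytesIO`
stands in for a stderr wrapper that only accepts `bytes`.

```python
import io


def write_text(stream, text, encoding="utf-8"):
    """Write `text` to a bytes-only stream, encoding str input first."""
    if isinstance(text, str):
        text = text.encode(encoding)
    stream.write(text)


buf = io.BytesIO()  # stand-in for a binary stderr
write_text(buf, "Traceback ...\n")
```

Calling `buf.write("...")` with a `str` directly raises `TypeError` on
Python 3, which is the failure mode described above.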

Reviewed By: DurhamG

Differential Revision: D23447984

fbshipit-source-id: 5896ae858f6022276fa47e08636c700159a2a678
2020-09-02 00:53:28 -07:00
Jun Wu
a0223bc7e7 dag: make iddagstore test generic
Summary: Make it possible to test other IdDagStores.

Reviewed By: sfilipco

Differential Revision: D23438178

fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
2020-09-01 23:58:04 -07:00
Jun Wu
c84653c7a9 py3: fix a crecord encoding issue
Summary: This only happens if specified context shows up.

Reviewed By: ytsheng

Differential Revision: D23460476

fbshipit-source-id: 788e236bd8e28918afa6b1e0a4e1be297b6f5a66
2020-09-01 21:24:53 -07:00
Jun Wu
211739f00c dag: remove SpanSetAsc
Summary:
Now that SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.

Reviewed By: sfilipco

Differential Revision: D23385246

fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
2020-09-01 21:02:08 -07:00
Jun Wu
64bdf70811 dag: add SpanSet::intersection_span_min
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.

Reviewed By: sfilipco

Differential Revision: D23385250

fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
2020-09-01 21:02:08 -07:00
Jun Wu
16eaceafe9 dag: use VecDeque for SpanSet
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.

Reviewed By: sfilipco

Differential Revision: D23385249

fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
2020-09-01 20:53:32 -07:00
Jun Wu
71f101054a dag: implement binary_search_by for VecDeque
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front and deprecates SpanSetAsc (which uses Id in a bit hacky
way - they are not real Ids).

Reviewed By: sfilipco

Differential Revision: D23385245

fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
2020-09-01 20:38:29 -07:00
Jun Wu
d8225764a5 py3: speed up simplemerge
Summary:
One user reports very very slow rebase (tens of minutes and running). The
commit is not very large. Python 2 can complete the rebase in 6 seconds.
I tracked it down to this code path. Making the change makes Python 3
rebase fast too (< 10 seconds). I haven't tracked down exactly why Python
3 is slow yet (maybe N^2 a += b)?

Some numbers about the slow merge:

  ipdb> p len(m3.atext)
  17984924
  ipdb> p len(m3.btext)
  17948110
  ipdb> p len(m3.a)
  613353
  ipdb> p len(m3.b)
  612129
  ipdb> p len(m3.base)
  612135
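
The "N^2 a += b" guess above refers to a general pattern, sketched here. This
is not the simplemerge change itself, and the root cause was not confirmed in
this commit. Repeated `+=` on an immutable `bytes` value may copy everything
accumulated so far on each step, while collecting chunks and joining once
copies each byte a single time.

```python
def concat_repeated_add(chunks):
    """Build the result with `+=`; each step can copy the whole prefix."""
    out = b""
    for chunk in chunks:
        out += chunk
    return out


def concat_join(chunks):
    """Build the result with a single join over all chunks."""
    return b"".join(chunks)


lines = [b"line %d\n" % i for i in range(1000)]
same = concat_repeated_add(lines) == concat_join(lines)
```

Both functions return the same bytes; only the amount of copying differs.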

Reviewed By: singhsrb

Differential Revision: D23441221

fbshipit-source-id: 14b725439f4ecd3352edca512cdde32958b2ce29
2020-09-01 20:32:10 -07:00
Jun Wu
2d02d3b0f7 dag: validate SpanSet order and no mergable adjacent spans
Summary:
Previously the `is_valid()` function only checked ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
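
The three checks can be sketched as below, in Python rather than the crate's
Rust. Spans are modeled as inclusive `(low, high)` tuples in ascending order,
which is an assumption for the example and not necessarily how `SpanSet`
stores them.

```python
def assert_valid(spans):
    """Assert span invariants, with a message naming the offending spans."""
    for low, high in spans:
        assert low <= high, f"span {low}..={high} has low > high"
    for prev, cur in zip(spans, spans[1:]):
        # Ordered and non-overlapping: the previous span ends before this one.
        assert prev[1] < cur[0], f"{prev} and {cur} are out of order"
        # Not adjacent: there is a gap of at least one id between them.
        assert prev[1] + 1 < cur[0], f"{prev} and {cur} should be merged"
```

Asserting directly means a failure reports which spans broke which rule,
rather than a bare `False`.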

Reviewed By: sfilipco

Differential Revision: D23385247

fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
2020-09-01 20:27:03 -07:00
Jun Wu
4bf5817dad dag: always merge adjacent spans in SpanSet
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too.  This simplifies
the `contains` logic and could make some Sets more efficient.
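
A small Python sketch of the merging behavior, using inclusive `(low, high)`
tuples in ascending order. The ordering, the function bodies, and the
`contains` helper are assumptions for illustration, not the Rust code.

```python
from bisect import bisect_right


def from_sorted_spans(spans):
    """Merge overlapping or adjacent spans from an ascending list."""
    merged = []
    for low, high in spans:
        if merged and merged[-1][1] + 1 >= low:
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged


def contains(spans, value):
    """Binary search for the last span whose low <= value, then check high."""
    pos = bisect_right(spans, (value, float("inf"))) - 1
    return pos >= 0 and spans[pos][0] <= value <= spans[pos][1]


merged = from_sorted_spans([(1, 2), (3, 4)])  # adjacent spans collapse
```

Once adjacent spans are always merged, a membership test only has to look at
one candidate span.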

Reviewed By: sfilipco

Differential Revision: D23385248

fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
2020-09-01 20:04:12 -07:00
Jun Wu
afa787bd5c rage: do not report 'serve' commands in sigtrace section
Summary:
There were some rage pastes with a very long "sigtrace" section (ex. P141069793).
It turns out the sigtrace has lots of "serve" commands that are started in
non-forking mode, producing very long traces like:

  Tracing Data:
  Process 726702 Thread 2610476:
     Start Dur.ms | Name                                              Source
         0    ... | Run Command                                       hgcommands::run line 296
                  | - pid = 726702                                    :
                  | - uid = 117869                                    :
                  | - nice = 0                                        :
                  | - args = ["/opt/fb/mercurial/hg.real","...        :
                  | - parent_pids = [2610476,1]                       :
                  | - parent_names = ["/opt/fb/mercurial/hg.real",""] :
                  | - exit_code = 0                                   :
                  | - max_rss = 0                                     :
        35    ... | Main Python Command                               (perftrace)
        35    +22  \ Repo Setup                                       edenscm.mercurial.hg line 168
                    | - local = true                                  :
        70   +802  \ Main Python Command                              (perftrace)
        72   +799   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
        74   +537   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
       940   +914  \ Main Python Command                              (perftrace)
       943   +910   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
       943   +617   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      1875   +866  \ Main Python Command                              (perftrace)
      1877   +863   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      1878   +604   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      2759  +2208  \ Main Python Command (719 times)                  (perftrace)
      3155   +860  \ Main Python Command                              (perftrace)
      3158   +856   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      3158   +543   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      4068   +883  \ Main Python Command                              (perftrace)
      4071   +879   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      4071   +591   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      4967   +913  \ Main Python Command                              (perftrace)
      4969   +910   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      4969   +621   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      6630   +922  \ Main Python Command                              (perftrace)
      6633   +918   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      6633   +640   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      7615   +856  \ Main Python Command                              (perftrace)
      7622   +849   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      7622   +581   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
      8487   +951  \ Main Python Command                              (perftrace)
      8490   +947   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
      8490   +671   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    139275   +794  \ Main Python Command                              (perftrace)
    139278   +790   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    139278   +539   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    140132   +837  \ Main Python Command                              (perftrace)
    140135   +832   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    140135   +544   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    140992   +814  \ Main Python Command                              (perftrace)
    140994   +811   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    140994   +546   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    306862   +864  \ Main Python Command                              (perftrace)
    306865   +860   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    306865   +586   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    307801   +858  \ Main Python Command                              (perftrace)
    307804   +854   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    307804   +587   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    308690   +874  \ Main Python Command                              (perftrace)
    308693   +869   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    308693   +610   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    506391   +924  \ Main Python Command                              (perftrace)
    506396   +917   | Status                                          edenscm.mercurial.dirstate line 957
                    | - A/M/R Files = 0                               :
    506396   +645   | Get EdenFS Status                               (perftrace)
                    | - status = true                                 :
    507401   +898  \ Main Python Command                              (perftrace)
    ....

Our chg usage does not start non-forking servers; those are apparently started by something related to emacs:

  args = ['--config', 'ui.interactive=True', '--config', 'ui.editor=emacsclient', '--config', 'extensions.shelve=', 'serve', '--cmdserver', ...]

Hide them in sigtrace to make rage paste shorter.

Reviewed By: DurhamG

Differential Revision: D23459991

fbshipit-source-id: 7ccc27dbe5ef03e0b97dbfec57213e5478003b1c
2020-09-01 19:57:41 -07:00
Jun Wu
5f0a6f35af py3: fix conflictinfo compatibility
Summary: File content needs to be encoded.

Reviewed By: DurhamG

Differential Revision: D23463706

fbshipit-source-id: e8e512668452618e3b139d7d94ec8776f2b6b25b
2020-09-01 18:31:35 -07:00
Jun Wu
062a83cc16 restack: fix bookmark movement with partial successful auto restack
Summary:
See the test change. Partially successful auto restack should have bookmarks
moved.

Reviewed By: DurhamG

Differential Revision: D23441932

fbshipit-source-id: 07e509a70bcc5cf81f702d40ec1b8dc4a5a781ff
2020-09-01 18:05:44 -07:00
Jun Wu
8191be83c1 tests: add a test for auto rebase bookmark movement issue
Summary: Reported By: asukhachev.

Reviewed By: DurhamG

Differential Revision: D23441931

fbshipit-source-id: b07f47e6796d4d0363250b3b1463f829bb5d0efa
2020-09-01 18:05:44 -07:00
Jun Wu
b3df065db5 debugshell: improve "%trace" UX
Summary: Print hints about how to enable detailed Python tracing.

Reviewed By: kulshrax

Differential Revision: D23437210

fbshipit-source-id: 009425a83945f9b5af2a6280c2572a782c6b349a
2020-09-01 13:49:13 -07:00
Thomas Orozco
0ab9638ef6 py3: fix lfs debuglfsreceive{,all}
Summary:
Those commands are broken right now: they try to write bytes but don't use
`writebytes`.

Reviewed By: DurhamG

Differential Revision: D23450968

fbshipit-source-id: 5d554771459f81718d90e5bad9a4c439cbb05d97
2020-09-01 11:04:16 -07:00
Thomas Orozco
46ab9553bc py3: fix lfs uploads not working anymore
Summary:
When Python 3 wants to upload a file-like object, it does something a bit
awkward: it sets the `Transfer-Encoding` to `chunked`, but doesn't actually
chunk the data. Also, for some reason, it still sets the `Content-Length`. I'm
not sure where that is coming from.

The thing is, when you set `Transfer-Encoding` to `chunked`, you do need to
chunk, or the other end is going to get very confused.

Unfortunately, this is not what happens here (note that the "send" logs are
from enabling http tracing in Python here, and those logs are basically one
line before `.send()` into a socket, so the chunking doesn't appear to happen
elsewhere):

```
[torozco@devbig051]~/opsfiles_bin % echo "aaaa" | ~/fbcode/buck-out/gen/eden/scm/__hg-py3__/hg-py3.sh debuglfssend https://mononoke-lfs.internal.tfbnw.net/opsfiles_bin
send: b'PUT /opsfiles_bin/upload/11a77c3d96c06974b53d7f40a577e6813739eb5c811b2a86f59038ea90add772/5 HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-length: 5\r\nx-client-correlator: tQT3yBfFEzhVtqI5\r\naccept: application/mercurial-0.1\r\ncontent-type: application/x-www-form-urlencoded\r\nhost: mononoke-lfs.internal.tfbnw.net\r\ntransfer-encoding: chunked\r\nuser-agent: mercurial/4.4.2_dev git/2.15.1\r\n\r\n'
sendIng a read()able
send: b'aaaa\n'
reply: 'HTTP/1.1 400 Bad request\r\n'
header: Content-Type: text/html; charset=utf-8
header: Access-Control-Allow-Origin: *
header: proxy-status: client_read_error; e_upip="AcLKajO63Vab0hC4kzGZQsqck3P_YOu7HsBzshC-NCbuo31tlWWqCiVw5xVLh44LYYe7qioCPqYSb8-1cBpdvFDZb_t5oYRP1Q"; e_proxy="AcJjRKHG02qo6Bv6fEPCUVF7DpCyrq3rmSnXhRLWakKWREEvVpk4jc-tzDyG6l9jvn3vNo8PYPG_5hLtC3L1"
header: Date: Tue, 01 Sep 2020 13:10:35 GMT
header: Connection: close
header: Content-Length: 2959
```

What's a bit confusing to me here is where this Content-length header comes
from. Indeed, normally Python 3 will:

- Not infer a content-length for file-like objects (which is what we have)
  https://fburl.com/ms94eq31
- Set Transfer-Encoding if no Content-Length is present:
  https://fburl.com/f81g8v2j

So, it's a bit unexpected that a) we have a Content-Length (we shouldn't), and
that we b) also have a Transfer-Encoding header. That said, setting the
Content-Length does fix the problem, so that's what this diff does.
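
For reference, this is roughly what a body framed with HTTP/1.1 chunked
transfer coding looks like on the wire. The helper `encode_chunked` is
hypothetical and only illustrates the framing; it is not what this diff does
(the diff sets Content-Length instead).

```python
def encode_chunked(chunks):
    """Frame an iterable of bytes objects as an HTTP/1.1 chunked body."""
    out = []
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out.append(b"%x\r\n" % len(chunk))  # chunk size in hex
        out.append(chunk)
        out.append(b"\r\n")
    out.append(b"0\r\n\r\n")  # last-chunk marker, no trailers
    return b"".join(out)


framed = encode_chunked([b"aaaa\n"])
```

Compare this with the trace above, where the raw `b'aaaa\n'` is sent without
any size line or terminating chunk.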

Reviewed By: DurhamG

Differential Revision: D23450969

fbshipit-source-id: e1f535ff3d0b49c0c914130593d9aebe89ba18ca
2020-09-01 11:04:16 -07:00
Stanislau Hlebik
2e2e2432a7 sparse: warn if dirstate includes marker files
Summary:
As a follow up to the previous diff, let's also warn if dirstate includes
marker files that should not be included in any sparse profiles.

Reviewed By: DurhamG

Differential Revision: D23414361

fbshipit-source-id: 3d171328bf0ba5754e5bacde85f09abb4fed8603
2020-08-31 23:21:41 -07:00
Jun Wu
56d0255228 extutil: drop runbgcommand
Summary: Callsites were migrated to `util.spawndetached`.

Reviewed By: DurhamG

Differential Revision: D23124753

fbshipit-source-id: f0345461a3f79f9bb6ff3a58e00cdf0ed1893645
2020-08-31 17:34:49 -07:00
Jun Wu
2cdca65aed remotefilelog: runshellcommand -> spawndetached
Summary: There seems to be no need to use a shell.

Reviewed By: DurhamG

Differential Revision: D23124756

fbshipit-source-id: 7de1c23e2325fe88dc4c6a2c90563d06f109ed2f
2020-08-31 17:34:49 -07:00
Jun Wu
ffb93ca839 commandcloud: runbgcommand -> spawndetached
Summary:
The Rust process utility avoids issues with interaction with Python and can do file
redirection on Windows.

Reviewed By: DurhamG

Differential Revision: D23124755

fbshipit-source-id: f72b88bafd19b3b41e53afbf6a4095d0d6bcb93a
2020-08-31 17:34:49 -07:00
Jun Wu
6e2a90ddb5 hooks: add predefined hook to run fsync
Reviewed By: DurhamG

Differential Revision: D22993217

fbshipit-source-id: 2cfb6b26479cd7dad02419fb76fa5d3ca5dd66db
2020-08-31 17:34:49 -07:00
Jun Wu
a01693df0e util: use Rust pyprocess to implement spawndetached
Summary:
The Rust bindings handle the cross-platform differences and avoid issues
with Python / Rust interaction. Use it.

As we're here, extend the API to support cwd and env.

Reviewed By: DurhamG

Differential Revision: D23124171

fbshipit-source-id: fdc13f6eaeb25c05b53d385eb220af33dad984e1
2020-08-31 17:34:48 -07:00
Jun Wu
a90c8ea775 bindings: export rust process handling to Python
Summary:
Spawning processes turns out to be tricky.

Python 2:

- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
  Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
  We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.

Python 3:

- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.

Rust:

- No "close_fds=True" support for both Windows and Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
  https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
  D23124167 provides a short-term solution that can have corner cases.

Mercurial:

- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
  the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
  (unless `CreateProcessA` speaks utf-8 natively).

We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.

The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.

Reviewed By: DurhamG

Differential Revision: D23124168

fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
2020-08-31 17:34:48 -07:00
Jun Wu
b7f2ee577a spawn-ext: extend Command::spawn to avoid inheriting fds
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].

However, that does not play well in our case:
- On Windows:
  - stdin/stdout/stderr are not created by Rust, and inheritable by
    default (other process like `cargo`, or `dotslash` might leak them too).
  - a few other handles like "Null", "Afd" are inheritable. It's
    unclear how they get created, though.
  - Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
    be not inheritable and do not require special handling.
- On Linux:
  - Files opened by Python or C likely lack F_CLOEXEC and need special
    handling.

Implement logic to close file handles (or set F_CLOEXEC) explicitly.

[1]: https://github.com/rust-lang/rust/issues/12148
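
On the Unix side, marking a descriptor close-on-exec can be sketched in Python
as below. This is only an illustration of the flag (spelled `FD_CLOEXEC` in
the `fcntl` module), not the Rust code in this diff. The helper names are made
up, and the example is Unix-only.

```python
import fcntl
import os


def is_cloexec(fd):
    """Return True if `fd` has the close-on-exec flag set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_cloexec(fd):
    """Set close-on-exec on `fd` so a spawned child does not inherit it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


r, w = os.pipe()
os.set_inheritable(r, True)  # mimic a descriptor opened without the flag
before = is_cloexec(r)
set_cloexec(r)
after = is_cloexec(r)
os.close(r)
os.close(w)
```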

Reviewed By: DurhamG

Differential Revision: D23124167

fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
2020-08-31 17:34:48 -07:00
Jun Wu
b3fd513ea4 util: make gethgcmd more reliable
Summary:
It uses `sys.argv`, which might be rewritten by `debugshell`. Capture
`sys.argv` to make hgcmd more reliable.
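
The capture can be sketched like this. `_ORIGINAL_ARGV` and
`original_command` are hypothetical names for illustration, not the
identifiers used in `util`.

```python
import sys

# Snapshot taken once at import time, before anything can rewrite sys.argv.
_ORIGINAL_ARGV = list(sys.argv)


def original_command():
    """Return the program part of the command line as originally invoked."""
    return _ORIGINAL_ARGV[:1]
```

Later in-place changes to `sys.argv` do not affect the snapshot because it is
a copy of the list.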

Reviewed By: DurhamG

Differential Revision: D22993215

fbshipit-source-id: 5fa319e8023b656c6cdf96cb3229ea9f2c9b9b99
2020-08-31 17:34:48 -07:00
Jun Wu
333177101f hooks: add a hook point after write commands
Summary: This allows us to run commands after changes were made to the repo.

Reviewed By: DurhamG

Differential Revision: D22993218

fbshipit-source-id: d9943dcda94da42970fb9107f48f4caa14b6a9d4
2020-08-31 17:34:48 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Jun Wu
9aa9d022ae util: stop using time.perf_counter() for timer()
Summary:
Some code paths (ex. metalog.commit) use `util.timer()` as a way to get
seconds since epoch, and get 0 for tests. Other use-cases of `util.timer()`
are ad-hoc time measurements for displaying speed / progress. They do not need high
precision or strong guarantee that the clock does not go backwards. Drop the
`time.perf_counter()` to meet the first use-case's expectation.
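
The difference matters because `time.perf_counter()` has an undefined
reference point, while `time.time()` returns seconds since the epoch. A
minimal sketch follows; using `unittest.mock` to pin the clock is an
assumption, since how the tests get 0 is not stated here.

```python
import time
from unittest import mock


def timer():
    """Wall-clock seconds since the epoch, as a float."""
    return time.time()


# A test can pin the wall clock; perf_counter() would ignore this patch.
with mock.patch("time.time", return_value=0.0):
    stubbed = timer()
```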

Reviewed By: singhsrb

Differential Revision: D23431253

fbshipit-source-id: 8bf2d1ed32e284e17285742e1d0fd7178f181fb3
2020-08-31 13:04:54 -07:00
Jun Wu
9f33746b31 histedit: do not show revision numbers
Summary:
With the segments backend, the revision numbers will be longer than commit hashes
and are confusing.

Reviewed By: DurhamG

Differential Revision: D23408971

fbshipit-source-id: e2057fa644fc7b6be4291f879eee3235bb4e687b
2020-08-31 11:57:53 -07:00
Jun Wu
96548cade8 remotefilelog: do not assume range(len(cl)) are valid revs in _linkrev
Summary: `range(len(cl))` contains invalid revs with segments backend.

Reviewed By: DurhamG

Differential Revision: D23411209

fbshipit-source-id: 2f83a5402bb46824cf38871926c1954507b64b56
2020-08-31 11:57:53 -07:00
Jun Wu
ff2d572717 changelog2: avoid excessive memory usage during large pulls
Summary:
Pulling from older repos (ex. years ago) could require GBs of commit text data.
Flush commit data if it exceeds a certain size.

This is for revlog compatibility.
In the future we will probably just make commit text lazy to avoid this kind of issue.
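
The size-triggered flush can be sketched as below. `CommitTextBuffer`, its
callback, and the byte limit are hypothetical and only show the pattern, not
the changelog2 code.

```python
class CommitTextBuffer:
    """Accumulate commit texts and flush once a byte limit is reached."""

    def __init__(self, flush_callback, limit_bytes):
        self._flush_callback = flush_callback
        self._limit = limit_bytes
        self._pending = []
        self._size = 0

    def add(self, text):
        self._pending.append(text)
        self._size += len(text)
        if self._size >= self._limit:
            self.flush()

    def flush(self):
        # Hand the pending batch to the callback, then start a new batch.
        if self._pending:
            self._flush_callback(self._pending)
        self._pending = []
        self._size = 0


flushed = []
buf = CommitTextBuffer(flushed.append, limit_bytes=10)
buf.add(b"12345")   # 5 bytes buffered, below the limit
buf.add(b"678901")  # 11 bytes total, triggers a flush
```

Memory use is then bounded by roughly the limit plus one commit text, instead
of growing with the size of the pull.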

Reviewed By: DurhamG

Differential Revision: D23408834

fbshipit-source-id: 273384f5a05be07877bb1c9871c17b53ba436233
2020-08-31 11:57:53 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00