Commit Graph

47180 Commits

Author SHA1 Message Date
Jun Wu
80124113f3 gc: do not break indexedlog data
Summary:
The code is broken:

- It happily deletes the `rotatelog/latest` file, which is unexpected by RotateLog.
- It can also delete `packs` on Windows, because `\packs\` never matches `/packs/`.

and suboptimal:

- It does unnecessary tests in inner loops.
- It does unnecessary subdir walks.

Fix them.
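
The Windows problem comes from testing native paths against a hard-coded `/packs/` substring. A minimal sketch of a separator-independent check follows. The helper name `is_under_packs` and the example paths are hypothetical and are not the code changed by this diff:

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_under_packs(path, windows=False):
    """Return True if any directory component of `path` is `packs`.

    Hypothetical helper: comparing path components avoids depending on
    whether the separator is `/` or `\\`.
    """
    flavour = PureWindowsPath if windows else PurePosixPath
    # Drop the final component (the file name) and inspect the directories.
    return "packs" in flavour(path).parts[:-1]


# A substring test for "/packs/" misses the Windows spelling of the same path.
windows_path = r"C:\cache\repo\packs\abc.datapack"
print("/packs/" in windows_path)                   # False
print(is_under_packs(windows_path, windows=True))  # True
```

Comparing components rather than substrings also avoids matching file names that merely contain the text `packs`.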

Reviewed By: singhsrb

Differential Revision: D17777647

fbshipit-source-id: 08afbb1439e36bb5194053e52e1901b538e42ba3
2019-10-07 08:36:47 -07:00
Stanislau Hlebik
cfeedf2c27 add missing newline
Summary:
Before this diff, the output of two lines was glued together:

```
No known server bookmarkssearching for changes
```

Reviewed By: farnz

Differential Revision: D17786686

fbshipit-source-id: ddb96c7fa391f4ca07a18a7a2145fff2b9d249bb
2019-10-07 03:30:28 -07:00
Jun Wu
9c02091449 revisionstore: stop auto recovery on indexedlog errors
Summary:
The auto recovery logic is more harmful than useful, as it hides real errors and
can corrupt other running hg processes. Remove it so we can see the real
errors. We now have a proper `hg doctor` command that can fix things
properly.

If this turns out to be an issue, we should investigate why data corruption
happened (it is only expected in forced reboot / OS crash cases), or add
some configurable auto recovery in the Python layer.

Reviewed By: xavierd

Differential Revision: D17755607

fbshipit-source-id: 0916b65d07da36af6c5aa6d2d6b69fa83d29d530
2019-10-04 20:37:06 -07:00
Jun Wu
bb9d5531ce commands: add a doctor command
Summary: For now it just repairs indexedlog stores. We can add other stuff later.

Reviewed By: xavierd

Differential Revision: D17755606

fbshipit-source-id: 0599ac0e8e5c049f4cf96ae30df53c920dee21a6
2019-10-04 20:37:06 -07:00
Jun Wu
7d013975ca remotefilelog: remove repo.fileslog side effect in reposetup
Summary:
The side effects are undesirable as an error in the store can crash any repo command.
That makes it harder to implement a `doctor` command to fix the stores.

Reviewed By: xavierd

Differential Revision: D17755609

fbshipit-source-id: 3ba1774de965c4d896178adc47df805f6e465071
2019-10-04 20:37:06 -07:00
Jun Wu
8595d3fde8 bindings: add APIs to call into indexedlog-based store repair methods
Summary: This exposes the repair API to the Python world.

Reviewed By: xavierd

Differential Revision: D17755604

fbshipit-source-id: fb5089a1f0648b18d4a338c3c73e939d5ce37bed
2019-10-04 20:37:05 -07:00
Jun Wu
3e99b568eb revisionstore: add repair method on indexedlog-based stores
Summary: This just calls into the indexedlog repair API.

Reviewed By: xavierd

Differential Revision: D17755608

fbshipit-source-id: ff6c99cadfc900f8ab8c49fe887161492e08c692
2019-10-04 20:37:05 -07:00
Jun Wu
0da18952b4 indexedlog: add a repair method for rotate::OpenOptions
Summary:
This runs an explicit "verify & repair" on all logs in a RotateLog and attempts
to fix the "latest" file.

Reviewed By: xavierd

Differential Revision: D17755605

fbshipit-source-id: eaab4a4e76060a4d094e2bbd42baca9f1e684240
2019-10-04 20:37:05 -07:00
Jun Wu
7dff3e1a9d indexedlog: ensure created logs are empty in RotateLog
Summary:
Use the new API `log::OpenOptions::delete_content` to ensure logs are empty.
This auto fixes issues where a stale directory with broken content can prevent
RotateLog from rotating things.

This has some side effects:

- Logs are logically empty but physically have some bytes - test change
- Reveals an integer overflow panic - fixed in logs.rs

Reviewed By: xavierd

Differential Revision: D17741995

fbshipit-source-id: 51904090dad60718deefa537cf4db91554f3ac31
2019-10-04 20:37:04 -07:00
Jun Wu
8160cd590e indexedlog: make logs in RotateLog lazy
Summary:
Previously, RotateLog loads as many logs as it can during initialization or
sync. However, that could be undesirable because loading too many logs can
take time. Make log loading lazy except for the `latest` log to reduce
overhead.
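
The lazy-loading idea can be sketched as follows. This is an illustration in Python, not the Rust implementation; the class `LazyLogs`, the `open_log` callback, and the integer log ids are all assumptions:

```python
class LazyLogs:
    """Hypothetical container that opens a log only on first access."""

    def __init__(self, log_ids, open_log):
        # `log_ids` is ordered oldest to newest; the last one is "latest".
        self._ids = list(log_ids)
        self._open_log = open_log
        self._cache = {}
        if self._ids:
            # The latest log is always needed, so load it eagerly.
            self.get(self._ids[-1])

    def get(self, log_id):
        # Open the log on first use and reuse it afterwards.
        if log_id not in self._cache:
            self._cache[log_id] = self._open_log(log_id)
        return self._cache[log_id]


opened = []


def fake_open(log_id):
    # Stand-in for an expensive open; records which logs were loaded.
    opened.append(log_id)
    return f"log-{log_id}"


logs = LazyLogs([0, 1, 2], fake_open)
print(opened)  # [2]
logs.get(0)
print(opened)  # [2, 0]
```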

Reviewed By: xavierd

Differential Revision: D17740792

fbshipit-source-id: cde4c1799ed55d390dadaa5bd34f3d2c6d0e1cf7
2019-10-04 20:37:04 -07:00
Jun Wu
3707eea6da indexedlog: add an API to delete log content
Summary: This can be used in RotateLog to ensure new logs being created are empty.

Reviewed By: xavierd

Differential Revision: D17732138

fbshipit-source-id: 57c86e586decf6e26fa7ebe2d74a93afb4559f43
2019-10-04 20:37:04 -07:00
Jun Wu
40a35334bb indexedlog: rewrite Log::repair
Summary:
Fix multiple issues. Namely:
- Move it to OpenOptions, since Log::repair requires a Log, and it's problematic to open a corrupted Log.
- Make it also repair indexes, since otherwise the performance would be terrible.
- Output some human messages about what was done.
- Make it safe (no SIGBUS) by not truncating data. This works because D16076658
  made Log ignore the physical file length, and only use the metadata length.
- Added a strong test which drove a lot of fixes in this stack.

Reviewed By: xavierd

Differential Revision: D17741210

fbshipit-source-id: 9363dc2f38e66df30b5ed0323455bf67b68227c1
2019-10-04 20:37:03 -07:00
Jun Wu
43919375bb indexedlog: make Log::sync handle non-append-only changes
Summary:
Add extra metadata to help detect non-append-only changes and make Log::sync
handle them by automatically reloading the log and indexes. This reduces the
chance that data is incorrectly written to a different log.

Reviewed By: xavierd

Differential Revision: D17732137

fbshipit-source-id: 33668913f1695a6c02af5b81a40214e5a521ef09
2019-10-04 20:37:03 -07:00
Jun Wu
415dad5587 indexedlog: rewrite Log::rebuild_indexes
Summary:
Fix multiple issues. Namely:
- Make it composable in 'repair' by adding an 'assume_locked' flavor.
- Make it possible to only rebuild corrupted indexes.
- Output some human messages about what was done.

Reviewed By: xavierd

Differential Revision: D17741997

fbshipit-source-id: d0f56544f69b2536459a7580e3c052d45454667d
2019-10-04 20:37:03 -07:00
Jun Wu
0d64d04958 indexedlog: add internal API Log::open_assume_locked
Summary:
This allows `Log::open` logic to be used in cases where the directory lock was
already taken. Namely, in `repair`, `sync` and other write operations.

Yes, the old code is wrong and can deadlock.

Reviewed By: xavierd

Differential Revision: D17742002

fbshipit-source-id: 8fbe02bee3e77e00743212ea5b456128d4371362
2019-10-04 20:37:03 -07:00
Jun Wu
d786f5d557 indexedlog: make create_in_memory take a ref
Summary: This makes OpenOptions reusable. It helps writing shorter tests.

Reviewed By: xavierd

Differential Revision: D17742000

fbshipit-source-id: aa6dd5cd1936ba73d5644ed90cce8f75c71585f7
2019-10-04 20:37:02 -07:00
Jun Wu
dec9087a1c indexedlog: update context message in Log::update_index_for_on_disk_entry_unchecked
Summary:
This is a small improvement. It turns out to be useful when debugging some
other issues.

Reviewed By: xavierd

Differential Revision: D17741994

fbshipit-source-id: 73e3ba2f222366c1b1c614da08f8869011eb87f8
2019-10-04 20:37:02 -07:00
Jun Wu
296079cb79 indexedlog: add Index::verify
Summary:
This allows Log to check whether logs are corrupted and decide whether to
rebuild them.

Reviewed By: xavierd

Differential Revision: D17741996

fbshipit-source-id: 37c0a2789465877680bf8ebc50afe5d132574f8e
2019-10-04 20:37:02 -07:00
Jun Wu
2e3c65185e exec: remove indexedlog_repair util
Summary: Upcoming changes will break it. Remove it for now.

Reviewed By: xavierd

Differential Revision: D17742005

fbshipit-source-id: 762874568a87188f149ad297a708687e4b961570
2019-10-04 20:37:01 -07:00
Jun Wu
af2e7eee4d indexedlog: add epoch to log metadata
Summary:
The new metadata is used to more reliably detect non-append-only changes.

This makes it possible to correctly reload or handle non-append-only cases, namely
`repair` and `rm -rf`.
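
A rough sketch of how an epoch can separate appends from rewrites. The field names, the `plan_sync` function, and the conservative handling of a shorter length are assumptions made for illustration, not the actual metadata layout:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogMeta:
    epoch: int   # hypothetical: replaced with a fresh value on non-append-only rewrites
    length: int  # logical length of the primary log data


def plan_sync(cached: LogMeta, on_disk: LogMeta) -> str:
    """Decide how a reader holding `cached` should react to `on_disk`."""
    if on_disk.epoch != cached.epoch:
        # The log was repaired or recreated, so in-memory offsets are stale.
        return "reload"
    if on_disk.length == cached.length:
        return "up-to-date"
    if on_disk.length > cached.length:
        return "read-appended"
    # Same epoch but shorter: not an append, so take the conservative path.
    return "reload"


print(plan_sync(LogMeta(epoch=7, length=100), LogMeta(epoch=7, length=180)))  # read-appended
print(plan_sync(LogMeta(epoch=7, length=100), LogMeta(epoch=9, length=40)))   # reload
```

Length alone cannot tell these cases apart: a log that was deleted and recreated may end up longer than the cached length, which would look like a plain append.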

Reviewed By: xavierd

Differential Revision: D17742003

fbshipit-source-id: effa2ede817bb155ba4614da1be2bc497f0f1eb9
2019-10-04 20:37:01 -07:00
Jun Wu
592e4e984f indexedlog: add ChecksumTable::new_empty
Summary:
ChecksumTable::open fails if the on-disk checksum is broken. That makes it
harder to repair checksum issues. Add a new API to fix it.

This makes `repair` able to fix index checksum corruption, covered by an
upcoming test about `repair`.

Reviewed By: xavierd

Differential Revision: D17741998

fbshipit-source-id: e43f7599d1e7e119537b075dd94e56f61779c605
2019-10-04 20:37:01 -07:00
Jun Wu
d686a36b9c indexedlog: mark certain io::Error as corruption
Summary:
Certain IO errors are surely data corruption - the data can be read
(no permission or resource errors), but does not meet expectations
(format or range).

This fixes some cases covered by an upcoming test about `repair`.

Reviewed By: xavierd

Differential Revision: D17742001

fbshipit-source-id: 8436d477a25efe09bd5a763371df913acbfccc68
2019-10-04 20:37:01 -07:00
Jun Wu
0343dcf08a commands: remove import stat
Summary: This should fix lints about `stat` being redefined.

Reviewed By: kulshrax

Differential Revision: D17766500

fbshipit-source-id: ca326ae9d0c922c60ac26f0e367a70e11d20fcd3
2019-10-04 18:12:10 -07:00
Jun Wu
fd7a6868a3 testutil/dott: fix setconfig with multiple args
Summary: The old code incorrectly only respects the last argument of `setconfig`.
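
A minimal sketch of handling every argument, assuming each one has the form `section.name=value`. The function `parse_setconfig` is hypothetical and is not the testutil code itself:

```python
def parse_setconfig(*args):
    """Parse every `section.name=value` argument, not just the last one."""
    parsed = []
    for arg in args:
        key, value = arg.split("=", 1)
        section, name = key.split(".", 1)
        # Collect inside the loop so earlier arguments are not overwritten.
        parsed.append((section, name, value))
    return parsed


print(parse_setconfig("ui.username=test", "extensions.rebase="))
# [('ui', 'username', 'test'), ('extensions', 'rebase', '')]
```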

Reviewed By: xavierd

Differential Revision: D17766375

fbshipit-source-id: ec70c10eb62803c9f3d8dde5d88f91d076a55049
2019-10-04 18:12:10 -07:00
Jason Fried
8d130ad96d pep-479 codemod
Summary:
https://www.python.org/dev/peps/pep-0479/

If you are in a generator (a function that uses "yield"), you are never allowed to raise StopIteration. Instead, you should just `return` or pass out of scope (implicit return None).
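
For illustration, a generator that stops early with `return`. The function name is made up for this sketch:

```python
def take_until_negative(values):
    """Yield values until the first negative number is seen."""
    for value in values:
        if value < 0:
            # Before PEP 479 this was sometimes written as
            # `raise StopIteration`; on Python 3.7+ that surfaces as a
            # RuntimeError, so end the generator with a plain return.
            return
        yield value


print(list(take_until_negative([3, 1, -1, 5])))  # [3, 1]
```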

Reviewed By: thatch

Differential Revision: D17749640

fbshipit-source-id: 9f1be673cf877ff193a0379a0208d037dd2d7bae
2019-10-04 15:24:23 -07:00
Stefan Filip
1d616cebe0 manifest: update test configs to enable rustmanifest by default
Summary:
Rust manifests are the future. Tests work well with them and we are looking
ahead at rolling them out. To protect against regressions on the Rust manifest
side we need to have an automated test story.

I think that the most effective thing we can do is to enable Rust Manifest for
all tests. The C++ codebase is not seeing any kind of modification and we plan
to delete it as soon as Rust manifests roll out.

Reviewed By: xavierd

Differential Revision: D17728414

fbshipit-source-id: 59979d02e3cece83e73569a43d6fdbb7a29dc66d
2019-10-04 10:52:13 -07:00
Aleksei Kulikov
1c2460479e snapshot: clean the internal files given the --clean flag
Summary: `merge/*` and `rebasestate` files.

Reviewed By: mitrandir77

Differential Revision: D17719134

fbshipit-source-id: a7ed2258396e8b758debdb9fa28c953b0a22e749
2019-10-04 10:07:55 -07:00
Aleksei Kulikov
b802a42448 snapshot: store metadata content
Summary:
Make it possible to store small files directly inside the metadata blob:
```
{
  "files" : {
    "unknown": {
      "some/big/untracked/file": {"oid": "oid0"},
      "some/small/untracked/file": {"content": <binary file contents>},
    },
    . . .
  },
  "version" : "version of metadata format"
}
```
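
A sketch of how an entry might be chosen between the two forms. The size threshold, the `store_blob` callback, and the hash-based object id are assumptions for illustration; the diff summary does not state them:

```python
import hashlib

SMALL_FILE_LIMIT = 1024  # hypothetical threshold in bytes


def metadata_entry(content, store_blob):
    """Return the metadata entry for one file.

    `store_blob` is a hypothetical callback that stores `content`
    elsewhere and returns its object id.
    """
    if len(content) <= SMALL_FILE_LIMIT:
        # Small files are embedded directly in the metadata blob.
        return {"content": content}
    return {"oid": store_blob(content)}


def fake_store(content):
    # Stand-in for a real blob store: derive an id from the content hash.
    return hashlib.sha256(content).hexdigest()


files = {
    "some/small/untracked/file": metadata_entry(b"hello\n", fake_store),
    "some/big/untracked/file": metadata_entry(b"x" * 4096, fake_store),
}
print(sorted(files["some/small/untracked/file"]))  # ['content']
print(sorted(files["some/big/untracked/file"]))    # ['oid']
```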

Reviewed By: mitrandir77

Differential Revision: D17689055

fbshipit-source-id: 7ab18d8f012a20be04d65fcbe2365ff0157386f1
2019-10-04 10:07:54 -07:00
Aleksei Kulikov
a45cf73590 snapshot: fix smartlog predicate for snapshots
Summary: This is a better way to wrap smartlog functionality.

Reviewed By: mitrandir77

Differential Revision: D17686738

fbshipit-source-id: 770b32c19c89a59aee0f8b25251faaae7685a4dd
2019-10-04 10:07:54 -07:00
Aleksei Kulikov
41c7d139af snapshot: fix graphnode and phabstatus templates for snapshots
Summary: The graphnode templatekw must be wrapped in the keywords dict, and the phabstatus template needs to work if the requested revision is hidden.

Reviewed By: mitrandir77

Differential Revision: D17685575

fbshipit-source-id: 5e4a8b45dfe6564af3f72de3511d834aa498d154
2019-10-04 10:07:53 -07:00
Jun Wu
70b566d008 indexedlog: mark NotFound during mmap as data corruption
Summary:
Right now, not being able to find the mmap file can be seen as data corruption.
The only case that NotFound needs special handling is at open time.

This fixes some cases covered by an upcoming test about `repair`.

Reviewed By: xavierd

Differential Revision: D17741999

fbshipit-source-id: 1bd7c65c5a6381892723b31e2e749b22081e96d2
2019-10-03 19:57:32 -07:00
Jun Wu
6735395ee0 indexedlog: remove failure
Summary: `indexedlog` now no longer depends on `failure`.

Reviewed By: xavierd

Differential Revision: D17732135

fbshipit-source-id: 79526dcfa0b5e5a11baca1395573c2aea9c9cc12
2019-10-03 19:57:32 -07:00
Jun Wu
d12269cbcc indexedlog: add context for public RotateLog APIs
Summary:
Similar to the previous change, add context for RotateLog APIs.

This shows error context that might replace backtrace. For example, run:

  cargo test --test low_fileno_limit -- --nocapture

An example (complicated) error:

  "/tmp/.tmp7kTUWt/rotatelog/1": cannot create tempfile
  in log::OpenOptions::open("/tmp/.tmp7kTUWt/rotatelog/1")
    OpenOptions = OpenOptions { index_defs: ["key1"], fsync: false, create: true, checksum_type: Auto, flush_filter: None }
  cannot create new empty log after failing to read existing logs
  in rotate::OpenOptions::open("/tmp/.tmp7kTUWt/rotatelog")
    OpenOptions = OpenOptions { max_bytes_per_log: 50, max_log_count: 20, recovery_policy: ROTATE_ON_CORRUPTED_LATEST_LOG, log_open_options: OpenOptions { index_defs: ["key1"], fsync: false, create: true, checksum_type: Auto, flush_filter: None } }
  Caused by 2 errors:
  - Custom { kind: Other, error: PathError { path: "/tmp/.tmp7kTUWt/rotatelog/1/.tmp6dorJq", err: Os { code: 24, kind: Other, message: "Too many open files" } } }
  - "/tmp/.tmp7kTUWt/rotatelog": no valid logs found
    Caused by 1 errors:
    - "/tmp/.tmp7kTUWt/rotatelog/0/index-key1.sum": cannot open checksum file
      in ChecksumTable::new
      in index::OpenOptions::open("/tmp/.tmp7kTUWt/rotatelog/0/index-key1")
        OpenOptions = OpenOptions { checksum_chunk_size: 1048576, fsync: false, len: Some(0), write: None, key_buf: Some(_) }
      in log::OpenOptions::open("/tmp/.tmp7kTUWt/rotatelog/0")
        OpenOptions = OpenOptions { index_defs: ["key1"], fsync: false, create: false, checksum_type: Auto, flush_filter: None }
      Caused by 1 errors:
      - Os { code: 24, kind: Other, message: "Too many open files" }

(Ignoring whitespace will make this diff much easier to review)

Reviewed By: xavierd

Differential Revision: D17732131

fbshipit-source-id: b1685ded5c76c1200b9c1985749bd67588df1fb3
2019-10-03 19:57:31 -07:00
Jun Wu
7889031092 indexedlog: migrate RotateLog to use new Error type
Summary: Now all `indexedlog` APIs use the new Error type.

Reviewed By: xavierd

Differential Revision: D17732136

fbshipit-source-id: 8d306a08d8e8052d1c5e68fc5f05a9eed5c7d21f
2019-10-03 19:57:31 -07:00
Jun Wu
029f233d32 indexedlog: make atomic_write return new Error type
Summary: This provides more details, and makes callsites simpler.

Reviewed By: xavierd

Differential Revision: D17732127

fbshipit-source-id: 0fe6dedee4ebb8874ea95505c86d8b107e3367ff
2019-10-03 19:57:31 -07:00
Jun Wu
0a045becd1 indexedlog: add error context for public Log APIs
Summary:
Similar to the previous change, add context for Log APIs.

This shows error context that might replace backtrace. For example, run:

  cargo test --test low_fileno_limit -- --nocapture

An example error looks like:

  "/tmp/.tmpjrsfQt/rotatelog/1/index-key1": cannot duplicate file descriptor
  in ChecksumTable::try_clone
  in Index::try_clone
    Index.path = "/tmp/.tmpjrsfQt/rotatelog/1/index-key1"
  in Log::sync
    Log.dir = Some("/tmp/.tmpjrsfQt/rotatelog/1")
  Caused by 1 errors:
  - Os { code: 24, kind: Other, message: "Too many open files" }

(Ignoring whitespace will make this diff much easier to review)

Reviewed By: xavierd

Differential Revision: D17732124

fbshipit-source-id: b0d500652d80b4a4755453c69bc05d467ecbdf90
2019-10-03 19:57:30 -07:00
Jun Wu
b5708c5caa indexedlog: add error context for public Index APIs
Summary:
Since we lost backtrace by opting out of `failure`, it'd be nice to restore some
"backtrace" information like what Index function is being called.

This diff adds it. It also includes more context like what key is being looked
up so it might actually be more useful than backtrace.

(Ignoring whitespace will make this diff much easier to review)

Reviewed By: xavierd

Differential Revision: D17732126

fbshipit-source-id: 8e5a2c714bee8a943076818f0cff3a21498a954e
2019-10-03 19:57:30 -07:00
Jun Wu
7d6d6ebfb0 indexedlog: migrate Log to use new Error type
Summary: This basically involves adding contexts for io::Error and other error types.

Reviewed By: xavierd

Differential Revision: D17732130

fbshipit-source-id: 79fb3b93d57562f1922f3990a8bda0018d2675e8
2019-10-03 19:57:30 -07:00
Jun Wu
48ceb99202 indexedlog: add utils::mmap_len
Summary: The new utility function makes it easier to deal with mmap errors.

Reviewed By: xavierd

Differential Revision: D17732139

fbshipit-source-id: 93c8209b983d51198ebb367db983a2e9bc498d63
2019-10-03 19:57:29 -07:00
Jun Wu
f9f969319d indexedlog: add directory locking utilities
Summary: This makes it easier to lock a directory and makes error handling easier.

Reviewed By: xavierd

Differential Revision: D17732133

fbshipit-source-id: a404d41c0aaee7aad43271433f1352a8aa06bccb
2019-10-03 19:57:29 -07:00
Jun Wu
eb53228f47 indexedlog: migrate part of Index to new Error type (7)
Summary:
Migrate the remaining part of Index functions to use the new Error type. This
gives us an accurate view about whether an error indicates data corruption or
not, and makes the code more friendly - it works with `std::error::Error` now.

Reviewed By: xavierd

Differential Revision: D17705168

fbshipit-source-id: 8ae518602e7379d121e718a08127f0873f2e2423
2019-10-03 19:57:29 -07:00
Jun Wu
0def09884f indexedlog: migrate part of Index to new Error type (6)
Summary:
Migrate some return types from Fallible to the new Result. The main changes are
the way `io::Result` gets handled. The new API enforces attaching a `path` and
a message to them.

Reviewed By: xavierd

Differential Revision: D17705163

fbshipit-source-id: d060bdb2846a75c588b99201fd07ca3872f3a358
2019-10-03 19:57:29 -07:00
Jun Wu
0740e20b83 indexedlog: migrate part of Index to new Error type (5)
Summary:
Migrate more free-form error handling like `data_error`, `parameter_error`
to the new Error type.

Reviewed By: xavierd

Differential Revision: D17705164

fbshipit-source-id: 45560a96e36fb5e83a9e365506e27c201f9448a6
2019-10-03 19:57:28 -07:00
Jun Wu
bfc618ea06 indexedlog: migrate part of Index to new Error type (4)
Summary:
Migrate `range_error` and `verify_checksum` to the `IndexBuf` trait so they
all get path information on error. Remove the free-form `range_error` and
`verify_checksum` functions.

Reviewed By: xavierd

Differential Revision: D17705165

fbshipit-source-id: 556fda8081c69b6beccc8c666902810a90635231
2019-10-03 19:57:28 -07:00
Jun Wu
de36889bf6 indexedlog: migrate part of Index to new Error type (3)
Summary:
A lot of functions take a (buf, checksum) tuple, instead of `Index` for input.
That is to avoid issues where borrowing the entire `Index` forbids modifying
other fields in `Index`.

However, not taking `Index` means it cannot figure out the file path on error.

To solve both problems, this diff defines a trait that is a subset of Index
including (on-disk buf, checksum, path). Then migrate functions from using
(buf, checksum) to the new trait (if it only needs to read from the on-disk
buffer), or &Index (if it also needs to work with in-memory dirty/mutable
data).

Reviewed By: xavierd

Differential Revision: D17705166

fbshipit-source-id: 90bde88142ea3718a2093beb02b8030d725a0e15
2019-10-03 19:57:28 -07:00
Jun Wu
3a8a96388d indexedlog: migrate part of Index to new Error type (2)
Summary:
Change some `range_error` to `Index::range_error`.
The new error is better because it includes path information.

Reviewed By: xavierd

Differential Revision: D17705162

fbshipit-source-id: 1de1c7cdd730fcf7c6c39e9e5840939fa561bc33
2019-10-03 19:57:27 -07:00
Jun Wu
2b842d8c79 indexedlog: migrate part of Index to new Error type (1)
Summary:
Change `read_bitmap_unchecked` and `read_raw_int_unchecked` to use the new
Error type. Change their function signature from taking `&[u8]` to taking
`&Index` so we can get the file path in the error message.

Reviewed By: xavierd

Differential Revision: D17705167

fbshipit-source-id: 82bcbe21061cdf993d5c7f9867941c1f936166e5
2019-10-03 19:57:27 -07:00
Jun Wu
52f8171869 indexedlog: migrate ChecksumTable to new Error type
Summary:
Migrate to the new Error type so we can know whether an error is considered
data corruption. The new Error should also provide more explicit error
messages.

(This diff is easier to review if whitespace changes are all ignored)

Reviewed By: xavierd

Differential Revision: D17696536

fbshipit-source-id: bfceffbf75a75940a90c914da7914a601d75a747
2019-10-03 19:57:27 -07:00
Jun Wu
0f3fda039d indexedlog: define a way to convert io::Result to Result
Summary:
`io::Result` is widely used in indexedlog internals and needs to be
converted to `Result`.

This diff defines the conversion function. It enforces 2 context parameters:
- File path.
- What operation is it? This is needed since we will lose the backtrace.
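
The same idea, sketched in Python rather than Rust: a low-level OS error is re-raised with the two required pieces of context. `ContextError` and `with_context` are hypothetical names, not the indexedlog API:

```python
import os
import tempfile


class ContextError(Exception):
    """Hypothetical wrapper pairing an OSError with a path and an operation."""

    def __init__(self, path, operation, source):
        super().__init__(f'"{path}": {operation}')
        self.path = path
        self.operation = operation
        self.source = source


def with_context(path, operation, func):
    """Run `func`; re-raise any OSError with the path and operation attached."""
    try:
        return func()
    except OSError as err:
        raise ContextError(path, operation, err) from err


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "index-key1.sum")
    try:
        with_context(missing, "cannot open checksum file", lambda: open(missing, "rb"))
    except ContextError as err:
        print(err)                # the path followed by the operation
        print(repr(err.source))   # the original OS error is kept as the source
```

Requiring both parameters at every conversion site is what lets the error message stand in for a backtrace.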

Reviewed By: xavierd

Differential Revision: D17696533

fbshipit-source-id: d9417a6b65cbfbb5d6d7d1c6449ddd13e3035b5c
2019-10-03 19:57:26 -07:00
Jun Wu
d58e5c3984 indexedlog: define new error types
Summary:
I need to make RotateLog understand whether errors that occurred in
Log/std::io/Index are data corruption or not. To be explicit, I defined an
`is_data_corruption` method. Downcasting an error chain does not look like a
reliable solution (e.g. it is hard to be confident that it covers all possible
cases).

There are other motivations for this change:
- `failure`: it is unfriendly in a low-level library; it requires callsites to
  use failure, too. `failure` is less maintained - it still provides the nice
  backtrace feature but it's more friendly if libraries just use std Error (we
  lose backtrace inside the library, but hopefully the errors are of high enough
  quality that backtrace in the application is enough for debugging).
- Error with multi-sources. Both std and failure Error provide one slot for
  "cause". Sometimes it's desirable to use multiple slots. For example,
  RotateLog::open fails to read existing logs, and also fails to auto recover
  by creating a new log. In that case, ideally we keep both errors in the
  returned type.
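
The shape of such an error can be sketched in Python. This is an illustration only: the class `StoreError`, its fields, and the rule that corruption in any source marks the whole error as corruption are assumptions, not the Rust type defined in this diff:

```python
class StoreError(Exception):
    """Hypothetical error with a message, several sources, and a corruption flag."""

    def __init__(self, message, sources=(), corruption=False):
        super().__init__(message)
        self.message = message
        self.sources = list(sources)
        self.corruption = corruption

    def is_data_corruption(self):
        # Assumption: corruption anywhere in the tree marks the whole error.
        return self.corruption or any(
            isinstance(source, StoreError) and source.is_data_corruption()
            for source in self.sources
        )


# Reading existing logs fails, and so does the attempt to create a new one.
read_failure = StoreError("no valid logs found", corruption=True)
create_failure = StoreError(
    "cannot create tempfile",
    sources=[OSError(24, "Too many open files")],
)
open_failure = StoreError(
    "cannot create new empty log after failing to read existing logs",
    sources=[create_failure, read_failure],
)

print(len(open_failure.sources))             # 2
print(open_failure.is_data_corruption())     # True
print(create_failure.is_data_corruption())   # False
```

An explicit method means callers never need to downcast, and keeping a list of sources preserves both failures instead of discarding one.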

Reviewed By: xavierd

Differential Revision: D17696532

fbshipit-source-id: 0387b3a3b71f097b1a3dc2dcc7671a43c465abb2
2019-10-03 19:57:26 -07:00