30 Commits

Author SHA1 Message Date
Jun Wu
b178492317 changelog: disable inline revlog
Summary:
The inline revlog format merges `.i` and `.d` into one `.i` file. It was intended to reduce the
number of files for filelogs. For the changelog one extra file does not hurt.

This makes it easier to write native code parsing the changelog revlog index.
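
As an illustration of what such a parser has to check, here is a minimal sketch. It assumes the revlog v1 layout, where the first four bytes of the index hold the version number in the low 16 bits and feature flags above them. The helper name `parse_revlog_header` is hypothetical and not part of this change.

```python
import struct

# Constants assumed from the revlog v1 format.
REVLOGV1 = 1
FLAG_INLINE_DATA = 1 << 16


def parse_revlog_header(index_bytes):
    """Return (version, is_inline) from the first 4 bytes of a revlog index."""
    if len(index_bytes) < 4:
        raise ValueError("index too short to contain a header")
    # The header is a big-endian 32-bit integer.
    (header,) = struct.unpack(">I", index_bytes[:4])
    version = header & 0xFFFF
    is_inline = bool(header & FLAG_INLINE_DATA)
    return version, is_inline


# Two hand-built headers: one inline, one with a separate `.d` file.
inline_header = struct.pack(">I", REVLOGV1 | FLAG_INLINE_DATA)
split_header = struct.pack(">I", REVLOGV1)
```

With inline disabled, a parser can treat the `.i` file as a flat array of fixed-size index entries and never has to skip over interleaved revision data.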

Reviewed By: xavierd

Differential Revision: D17125922

fbshipit-source-id: f48ffe0d2df71abec007a80e05b684dcbac71883
2019-08-30 14:58:02 -07:00
Durham Goode
ad813edcbd treemanifest: enable treemanifest by default in tests
Summary:
Now that all our repos are treemanifest, let's enable the extension by
default in tests. Once we're certain no one needs it in production we'll also
make it the default in core Mercurial.

This diff includes a minor fix in treemanifest to be aware of always-enabled
extensions. It won't matter until we actually add treemanifest to the list of
default enabled extensions, but I caught this while testing things.

Reviewed By: ikostia

Differential Revision: D15030253

fbshipit-source-id: d8361f915928b6ad90665e6ed330c1df5c8d8d86
2019-05-28 03:17:02 -07:00
Jun Wu
d03f2d26c2 manifest: drop manifestv2 support
Summary:
The upstream has removed it in https://phab.mercurial-scm.org/D2393. Do the
same.

The deleted C++ code seems to leak `Py_False` if `usemanifestv2` is not set.

Reviewed By: singhsrb

Differential Revision: D14611525

fbshipit-source-id: d828526c31aaa861d100a846bba79d1f5898e245
2019-03-26 13:32:45 -07:00
Mark Thomas
08567ee311 localrepo: don't add storerequirements by default
Summary:
This allows versions that don't know about storerequirements to still access
repos newly created with this version.  We will turn this on at a later date.

Reviewed By: singhsrb

Differential Revision: D10033964

fbshipit-source-id: e1065e05c33544d0287eda5eb852baff07c13147
2018-09-25 12:37:57 -07:00
Mark Thomas
0ca4acd250 localrepo: add storerequirements feature
Summary:
Add the `storerequirements` feature to the repo.  This means the store may have
a `requires` file, and clients must check it for any store features that they
may be missing.  This allows new requirements to be added that affect the store
when the repo is shared.  Currently there are no store features.
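
The check a client has to make can be sketched roughly as follows. This is only an illustration: the function names and the one-requirement-per-line file layout are assumptions, not the code in this commit.

```python
import os
import tempfile

# Hypothetical set of store features this client understands.
# The commit message says there are currently none.
SUPPORTED_STORE_FEATURES = set()


def read_store_requirements(store_path):
    """Read <store>/requires, assuming one requirement per line.

    A missing file is treated as "no store requirements".
    """
    try:
        with open(os.path.join(store_path, "requires")) as f:
            return {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        return set()


def missing_store_features(store_path, supported=SUPPORTED_STORE_FEATURES):
    """Return store requirements that this client does not support."""
    return read_store_requirements(store_path) - set(supported)
```

A client would refuse to open the store whenever `missing_store_features()` returns a non-empty set. Because the file lives in the store, repos that share the store see the same requirements.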

This commit adds support for the feature, and only new repos have the
requirement added.  A future commit will optimistically upgrade repos to
include the requirement.

Reviewed By: quark-zju

Differential Revision: D9699156

fbshipit-source-id: 95c1ab6973d44c02abc69b78a15311fe6a8696fd
2018-09-15 03:22:34 -07:00
Jun Wu
d26a9397e6 dirstate: unify format configs
Summary:
Previously, there were 2 configs: `treedirstate.useinnewrepos` and
`format.usetreestate`. They are both related to dirstate format and conflict
with each other. This patch unifies them into a single config
`format.dirstate`.

While we're here, merge `test-fb-hgext-treedirstate-x.t` into `test-dirstate-x.t`
if they were previously copied from `test-dirstate-x.t`.

Reviewed By: markbt

Differential Revision: D8393878

fbshipit-source-id: 57abeea22ce732d93205e4d4308923afa90693f4
2018-06-13 18:17:26 -07:00
Jun Wu
77638ffcc0 treedirstate: actually enable it in tests
Summary:
Previously it was not actually used.

`test-hgext-repogenerator.t` changed because treedirstate uses random
numbers to generate file names.

`fakedirstatewritetime.py` was updated to be treedirstate-aware. This
makes test-revert.t test-merge-tools.t test-merge1.t pass.

Reviewed By: singhsrb

Differential Revision: D7844960

fbshipit-source-id: 33a1d0d4a8e22ea5e6bb6454956884571fcf6bab
2018-05-02 17:15:36 -07:00
Jun Wu
e81c53461e largefiles: remove the extension
Summary:
`lfs` is the better large file solution. `largefiles` is rarely used, and
its implementation is less clean. So let's remove it.

Test Plan:
Ran all tests. A subrepo test was removed instead of cleaned up since the
longer term plan is to also drop subrepo support.

Reviewers: phillco, #mercurial

Reviewed By: phillco

Differential Revision: https://phabricator.intern.facebook.com/D6740361

Signature: 6740361:1516225594:555e3803571ad05e0434021897a2823ac99347ae
2018-01-17 11:50:44 -08:00
Yuya Nishihara
822286f829 debugformat: embed raw values in JSON and template output 2017-12-10 19:41:49 +09:00
Yuya Nishihara
4087d826c3 debugformat: flush formatter output per item 2017-12-10 19:39:39 +09:00
Matt Harbison
3e757c482b lfs: restore the local blob store after a repo upgrade
This also ends up testing the local extension wrapping for dstrepo during
upgrade, which was fixed in f0a28956f345.
2017-12-08 00:18:30 -05:00
Matt Harbison
b798276c65 tests: add coverage for preserving 'lfs' requirement on repo upgrade
The test also shows that the local blob store is erroneously lost.
2017-12-07 22:36:31 -05:00
Matt Harbison
0474cfef45 test-upgrade-repo: glob away timing values 2017-12-07 22:35:19 -05:00
Boris Feld
d8aa8e36d8 upgrade: add a 'redeltafullall' mode
We add a new mode for delta recomputation. When selected, each full text will
go through the full "addrevision" mechanism again. This is slower than
"redeltaall" but gives extensions the opportunity to trigger special
logic. For example, the lfs extension can decide to promote some revisions to
lfs storage during the upgrade.
2017-12-07 20:27:03 +01:00
Boris Feld
826e838aa1 upgrade: use the repository 'ui' as the base for the new repository
The `repo.baseui` contains all the configuration except the part specific to the
repository (so it can be used when dealing with local peers and
sub-repositories). However, we need the repository config to be taken into account
when doing the upgrade. Otherwise, the upgrade-related config that exists in
the repository config won't be taken into account when performing the upgrade.
That is buggy and surprising behavior.

We had to work around protection set around `repo.ui.copy` since we are an
uncommon case.
2017-12-07 18:55:35 +01:00
Boris Feld
5c9784830a upgrade: add a test to show the repository config being ignored
The upgrade process ignores the config within the repository. The next
changeset fixes it, but we introduce this test first to show that it actually
tests our target.
2017-12-07 20:50:24 +01:00
Boris Feld
9f250d48c5 upgrade: register compression as a format variants
Compression is a promising vector for speedup. Let us make it easier to check
the compression used and to upgrade existing repositories.
2017-12-07 16:50:48 +01:00
Boris Feld
c2382c14ab debugformat: update label depending on value difference
The new labels highlight areas where the repo format differs from the current
config or the default. This should help people spot areas where a repository
does not match the expected state.
2017-12-07 16:12:32 +01:00
Boris Feld
5ef878754c debugformat: add data about the config when verbose
In verbose mode, the command also displays the current configuration choice
for the variant and the global Mercurial default for it.
2017-12-07 16:05:20 +01:00
Boris Feld
3b270be25b debugformat: add a 'debugformat' command
The command displays basic data about all format variants registered for repo
upgrades. This gives a quick way to peek into a repository format.

The 'fm.write()' calls are kept independent because more data will be added in
later changesets. Having separate calls makes the later patches clearer.
2017-12-07 16:19:46 +01:00
Boris Feld
d36b88cc11 largefiles: allow to run 'debugupgraderepo' on repo with largefiles
The extension wraps the necessary functions to ensure the 'largefiles'
requirement won't be dropped.

It is now possible to run `hg debugupgraderepo` on a repository with largefiles.
2017-12-07 01:53:14 +01:00
Gregory Szorc
34d4f7ff46 repair: use rawvfs when copying extra store files
If we use the normal vfs, store encoding will be applied when we
.join() the path to be copied. This results in attempting to copy
a file that (likely) doesn't exist. Using the rawvfs operates on
the raw file path, which is returned by vfs.readdir().
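
The failure mode can be sketched with a toy encoder. `toy_store_encode` below is a deliberately simplified stand-in for the store filename encoding (it only escapes uppercase letters and underscores) and is not Mercurial's real implementation; `files_to_copy` is likewise hypothetical.

```python
def toy_store_encode(path):
    """Simplified stand-in for store filename encoding."""
    out = []
    for ch in path:
        if ch == "_":
            out.append("__")  # escape the escape character itself
        elif ch.isupper():
            out.append("_" + ch.lower())  # uppercase becomes "_" + lowercase
        else:
            out.append(ch)
    return "".join(out)


def files_to_copy(names_on_disk, encode):
    """Return the names that still resolve when `encode` runs before lookup."""
    on_disk = set(names_on_disk)
    return [name for name in names_on_disk if encode(name) in on_disk]


# Names as a directory listing would return them (already raw).
listing = [".DS_Store", "phaseroots"]

# Encoding vfs: ".DS_Store" is looked up under an encoded name that does not exist.
found_with_encoding = files_to_copy(listing, toy_store_encode)
# Raw vfs: names are used exactly as listed.
found_raw = files_to_copy(listing, lambda name: name)
```

Under these assumptions only `phaseroots` survives the encoding lookup, while the raw lookup finds both files, which mirrors the difference between the normal vfs and the rawvfs described above.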

Users at Mozilla are encountering this, as I've instructed them to
run `hg debugupgraderepo` to upgrade to generaldelta. While Mercurial
shouldn't deposit any files under .hg/store that require encoding, it
is possible for e.g. .DS_Store files to be created by the operating
system.
2017-04-08 11:36:39 -07:00
Gregory Szorc
93df60bae1 tests: add test demonstrating buggy path handling
`hg debugupgraderepo` is currently buggy with regards to path
handling when copying files in .hg/store/. Specifically, it applies
the store filename encoding to paths instead of operating on raw
files.

This commit adds a test demonstrating the buggy behavior.
2017-04-08 11:35:29 -07:00
Gregory Szorc
abe1c0e17e repair: clean up stale lock file from store backup
Since we did a directory rename on the stores, the source
repository's lock path now references the dest repository's
lock path and the dest repository's lock path now references
a non-existent filename.

So releasing the lock on the source will unlock the dest and
releasing the lock on the dest will no-op because it fails due
to file not found. So we clean up the dest's lock manually.
2016-11-24 18:45:29 -08:00
Gregory Szorc
a400e3d753 repair: copy non-revlog store files during upgrade
The store contains more than just revlogs. This patch teaches the
upgrade code to copy regular files as well.

As the test changes demonstrate, the phaseroots file is now copied.
2016-11-24 18:34:50 -08:00
Gregory Szorc
93504084a0 repair: migrate revlogs during upgrade
Our next step for in-place upgrade is to migrate store data. Revlogs
are the biggest source of data within the store and a store is useless
without them, so we implement their migration first.

Our strategy for migrating revlogs is to walk the store and call
`revlog.clone()` on each revlog. There are some minor complications.

Because revlogs have different storage options (e.g. changelog has
generaldelta and delta chains disabled), we need to obtain the
correct class of revlog so inserted data is encoded properly for its
type.

Various attempts at implementing progress indicators that didn't lead
to frustration from false "it's almost done" indicators were made.

I initially used a single progress bar based on number of revlogs.
However, this quickly churned through all filelogs, got to 99% then
effectively froze at 99.99% when it got to the manifest.

So I converted the progress bar to total revision count. This was a
little bit better. But the manifest was still significantly slower
than filelogs and it took forever to process the last few percent.

I then tried both revision/chunk bytes and raw bytes as the
denominator. This had the opposite effect: because so much data is in
manifests, it would churn through filelogs without showing much
progress. When it got to manifests, it would fill in 90+% of the
progress bar.

I finally gave up having a unified progress bar and instead implemented
3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and
1 for changelog revisions. I added extra messages indicating the total
number of revisions of each so users know there are more progress bars
coming.
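
The per-kind totals behind the three progress bars could be computed along these lines. This is a sketch only: the path conventions (`00changelog.i`, `00manifest.i`, everything else treated as a filelog), the function names, and the sample data are assumptions for illustration.

```python
from collections import Counter


def classify_revlog(store_relative_path):
    """Bucket a revlog by its path inside the store."""
    if store_relative_path == "00changelog.i":
        return "changelog"
    if store_relative_path.endswith("00manifest.i"):
        return "manifest"
    return "filelog"


def revisions_per_kind(revlogs):
    """Sum revision counts per kind from (path, revision_count) pairs."""
    totals = Counter()
    for path, count in revlogs:
        totals[classify_revlog(path)] += count
    return totals


# Made-up store contents, not real data.
sample = [
    ("00changelog.i", 3),
    ("00manifest.i", 3),
    ("data/a.txt.i", 2),
    ("data/b.txt.i", 5),
]
totals = revisions_per_kind(sample)
```

Each total then serves as the denominator of its own progress bar, so a slow manifest no longer stalls a bar that the filelogs have already filled.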

I also added extra messages before and after each stage to give extra
details about what is happening. Strictly speaking, this isn't
necessary. But the numbers are impressive. For example, when converting
a non-generaldelta mozilla-central repository, the messages you see are:

   migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 321394 in changelog)
   migrating 1.67 GB in store; 2508 GB tracked data
   migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 57.3 GB tracked data)
   finished migrating 1833043 filelog revisions across 267868 filelogs; change in size: -415776 bytes
   migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB tracked data)

That "2508 GB" figure really blew me away. I had no clue that the raw
tracked data in mozilla-central was that large. Granted, 2451 GB is in
the manifest and "only" 57.3 GB is in filelogs. But still.

It's worth noting that gratuitous loading of source revlogs in order
to display numbers and progress bars does serve a purpose: it ensures
we can open all source revlogs. We don't want to spend several minutes
copying revlogs only to encounter a permissions error or similar later.

As part of this commit, we also add swapping of the store directory
to the upgrade function. After revlogs are converted, we move the
old store into the backup directory then move the temporary repo's
store into the old store's location. On well-behaved systems, this
should be 2 atomic operations and the window of inconsistency should be
very narrow.
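
The swap can be sketched as two renames. The function name and directory layout below are hypothetical, and the sketch assumes all three paths live on the same filesystem so that each rename is a single operation.

```python
import os
import tempfile


def swap_store(repo_path, tmp_repo_path, backup_path):
    """Replace <repo>/store with <tmp_repo>/store, keeping the old one as a backup."""
    old_store = os.path.join(repo_path, "store")
    new_store = os.path.join(tmp_repo_path, "store")
    os.makedirs(backup_path, exist_ok=True)
    # Step 1: move the old store out of the way.
    os.rename(old_store, os.path.join(backup_path, "store"))
    # Between the two renames the repo has no store at all.
    # Step 2: move the upgraded store into place.
    os.rename(new_store, old_store)
```

If the process dies between the two steps, the old data is still intact in the backup directory, which is why the old store is moved rather than deleted.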

There are still a few improvements to be made to store copying and
upgrading. But this commit gets the bulk of the work out of the way.
2016-12-18 17:00:15 -08:00
Gregory Szorc
b9b6954ea9 repair: begin implementation of in-place upgrading
Now that all the upgrade planning work is in place, we can start
doing the real work: actually upgrading a repository.

The main goal of this commit is to get the "framework" for running
in-place upgrade actions in place.

Rather than get too clever and low-level with regards to in-place
upgrades, our strategy is to create a new, temporary repository,
copy data to it, then replace the old data with the new. This allows
us to reuse a lot of code in localrepo.py around store interaction,
which will eventually consume the bulk of the upgrade code.

But we have to start small. This patch implements adding new
repository requirements. But it still sets up a temporary
repository and locks it and the source repo before performing the
requirements file swap. This means all the plumbing is in place
to implement store copying in subsequent commits.
2016-12-18 16:59:04 -08:00
Gregory Szorc
a3569d4b71 repair: determine what upgrade will do
This commit introduces code for determining what actions/improvements
an upgrade should perform.

The "upgradefindimprovements" function introduces a mechanism to
return a list of improvements that can be made to a repository.
Each improvement is effectively an action that an upgrade will
perform. Associated with each of these improvements is metadata
that will be used to inform users what's wrong and what an
upgrade will do.

Each "improvement" is categorized as a "deficiency" or an
"optimization." TBH, I'm not thrilled about the terminology and
am receptive to constructive bikeshedding. The main difference
between a "deficiency" and an "optimization" is a deficiency
is always corrected (if it deviates from the current config) and
an "optimization" is an optional action that goes above and beyond
to improve the state of the repository (usually by requiring more
CPU during upgrade).
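
A minimal sketch of that distinction follows. The `Improvement` class and `select_actions` helper are hypothetical and do not reflect the data structures in this commit. The sketch assumes the input list already contains only deficiencies that were detected for the repository.

```python
from dataclasses import dataclass

DEFICIENCY = "deficiency"
OPTIMIZATION = "optimization"


@dataclass(frozen=True)
class Improvement:
    name: str
    kind: str
    description: str


def select_actions(improvements, requested_optimizations):
    """Deficiencies always run; optimizations run only when requested."""
    selected = []
    for improvement in improvements:
        if improvement.kind == DEFICIENCY:
            selected.append(improvement.name)
        elif improvement.name in requested_optimizations:
            selected.append(improvement.name)
    return selected


# Illustrative entries only.
found = [
    Improvement("generaldelta", DEFICIENCY, "repository lacks generaldelta"),
    Improvement("redeltaall", OPTIMIZATION, "recompute all deltas"),
]
```

With no optimizations requested, only `generaldelta` is selected. Passing `{"redeltaall"}` adds the more expensive delta recomputation on top.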

Our initial set of improvements identifies missing repository
requirements, a single, easily correctable problem with
changelog storage, and a set of "optimizations" related to delta
recalculation.

The main "upgraderepo" function has been expanded to handle
improvements. It queries for the list of improvements and determines
which of them will run based on the current repository state and user

I went through numerous iterations of the output format before
settling on a ReST-inspired definition list format. (I used
bulleted lists in the first submission of this commit and could
not get it to format just right.) Even with the various iterations,
I'm still not super thrilled with the format. But, this is a debug*
command, so that should mean we can refine the output without BC
concerns.
2016-12-18 16:51:09 -08:00
Gregory Szorc
f42e2dcaac repair: implement requirements checking for upgrades
This commit introduces functionality for upgrading a repository in
place. The first part that's implemented is testing for upgrade
"compatibility." This is done by examining repository requirements.

There are 5 functions returning sets of requirements that control
upgrading. Why so many functions? Mainly to support extensions.
Functions are easier to monkeypatch than module variables.

Astute readers will see that we don't support "manifestv2" and
"treemanifest" requirements in the upgrade mechanism. I don't have
a great answer for why other than this is a complex set of patches
and I don't want to deal with the complexity of these experimental
features just yet. We can teach the upgrade mechanism about them
later, once the basic upgrade mechanism is in place.

This commit also introduces the "upgraderepo" function. This will be
our main routine for performing an in-place upgrade. Currently, it
just implements requirements checking. The structure of some code in
this function may look a bit weird (e.g. the inline function that is
only called once). But this will make sense after future commits.
2016-12-18 16:16:54 -08:00
Gregory Szorc
16568ee7f0 debugcommands: stub for debugupgraderepo command
Currently, if Mercurial introduces a new repository/store feature or
changes behavior of an existing feature, users must perform an
`hg clone` to create a new repository with hopefully the
correct/optimal settings. Unfortunately, even `hg clone` may not
give the correct results. For example, if you do a local `hg clone`,
you may get hardlinks to revlog files that inherit the old state.
If you `hg clone` from a remote or `hg clone --pull`, changegroup
application may bypass some optimization, such as converting to
generaldelta.

Optimizing a repository is harder than it seems and requires more
than a simple `hg` command invocation.

This commit starts the process of changing that. We introduce
`hg debugupgraderepo`, a command that performs an in-place upgrade
of a repository to use new, optimal features. The command is just
a stub right now. Features will be added in subsequent commits.

This commit does foreshadow some of the behavior of the new command,
notably that it doesn't do anything by default and that it takes
arguments that influence what actions it performs. These will be
explained more in subsequent commits.
2016-11-24 16:24:09 -08:00