Summary:
Directory listing order differs between OSes, and with the current repack
implementation this directly affects the order in which the packfiles are added
to the new one. Since the resulting packfile name depends on the hash of its
content, the name was influenced by the directory order.
By sorting the files in list_packs, the packfile name will be independent of
the directory listing and thus be the same for all the OSes.
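A minimal sketch of the idea, not the actual revisionstore code: `sorted_packs` and the body of `list_packs` below are assumptions made for illustration. Sorting the collected paths before repack consumes them makes the input order, and therefore the hashed output name, independent of what `read_dir` returns.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Keep only the paths with the given extension and return them sorted, so
/// the result does not depend on the order in which the paths were produced.
fn sorted_packs<I>(paths: I, extension: &str) -> Vec<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
{
    let mut packs: Vec<PathBuf> = paths
        .into_iter()
        .filter(|p| p.extension().and_then(|e| e.to_str()) == Some(extension))
        .collect();
    packs.sort();
    packs
}

/// Hypothetical stand-in for `list_packs`: read the directory in whatever
/// order the OS reports, then normalize the order with `sorted_packs`.
fn list_packs(dir: &Path, extension: &str) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        paths.push(entry?.path());
    }
    Ok(sorted_packs(paths, extension))
}
```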
Reviewed By: singhsrb
Differential Revision: D13700935
fbshipit-source-id: 01e055a0c1bcf7fb2dc4faf614dfb20cd4499017
Summary: For now, combine all files smaller than 100MB that accumulate to less than 4GB.
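One way to read this selection rule, as a sketch under assumptions: `select_packs` and the two constants are hypothetical names, MB and GB are taken as powers of two, and the scan is assumed to stop at the first pack that would reach the total cap. None of this is confirmed by the diff itself.

```rust
/// Assumed thresholds: "100MB" and "4GB" interpreted as binary units.
const MAX_PACK_SIZE: u64 = 100 * 1024 * 1024;
const MAX_TOTAL_SIZE: u64 = 4 * 1024 * 1024 * 1024;

/// Pick the packs to combine: skip any pack at or above `max_pack_size`, and
/// stop once adding another pack would bring the total to `max_total_size`.
fn select_packs(packs: &[(&str, u64)], max_pack_size: u64, max_total_size: u64) -> Vec<String> {
    let mut total = 0u64;
    let mut selected = Vec::new();
    for &(name, size) in packs {
        if size >= max_pack_size {
            // Too large to be worth rewriting; leave it alone.
            continue;
        }
        if total.saturating_add(size) >= max_total_size {
            // The combined pack must stay strictly below the cap.
            break;
        }
        total += size;
        selected.push(name.to_string());
    }
    selected
}
```

In real use the thresholds would be `MAX_PACK_SIZE` and `MAX_TOTAL_SIZE`; passing them as parameters keeps the rule easy to exercise with small numbers.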
Reviewed By: DurhamG
Differential Revision: D13603760
fbshipit-source-id: 3fa74f1ced3d3ccd463af8f187ef5e0254e1820b
Summary: Use the newly introduced PackWriter to write the {data,history}packs.
Reviewed By: markbt
Differential Revision: D13603759
fbshipit-source-id: 528a6af7c4ac3321aeec0559805de12114224cfd
Summary:
The packfiles are currently being written via an unbuffered file. This is
inefficient as every write to the file results in a write(2) syscall.
By buffering these writes we can reduce the number of syscalls and thus
increase the throughput of pack writing operations.
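The effect can be sketched with a writer that counts how often it is called. `CountingWriter`, `write_entries` and `write_entries_buffered` are hypothetical helpers standing in for the pack file and the pack writing code; only `std::io::BufWriter` is the real API.

```rust
use std::io::{self, BufWriter, Write};

/// Stand-in for a pack file: records how many `write` calls reach it. For a
/// real `File`, each of those calls would be one write(2) syscall.
#[derive(Debug, Default)]
struct CountingWriter {
    writes: usize,
    bytes: Vec<u8>,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Write `count` small fixed-size entries, then flush and hand the writer back.
fn write_entries<W: Write>(mut out: W, count: u32) -> io::Result<W> {
    for i in 0..count {
        out.write_all(&i.to_be_bytes())?;
    }
    out.flush()?;
    Ok(out)
}

/// Same as `write_entries`, but routed through a `BufWriter` so that small
/// writes are coalesced before they reach the underlying writer.
fn write_entries_buffered(out: CountingWriter, count: u32) -> io::Result<CountingWriter> {
    let buffered = write_entries(BufWriter::with_capacity(8192, out), count)?;
    buffered.into_inner().map_err(io::Error::from)
}
```

Writing 1000 four-byte entries directly reaches the underlying writer 1000 times; through the buffer the same bytes arrive in far fewer calls.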
Reviewed By: markbt
Differential Revision: D13603758
fbshipit-source-id: 649186a852d427a1473695b1d32cc9cd87a74a75
Summary:
The revisionstore crate currently consists of several public submodules,
each exposing several public types. The APIs exposed by each of the modules
require using types from the other modules. As such, users of this crate are
forced to have complex nested imports to use any of its functionality.
This diff helps ease this problem by reexporting the public types exposed from
each of the public submodules at the top level, thereby allowing crate users to
`use` all of the required types without needing nested imports.
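As an illustration of the pattern only: the module layout and the `Key` and `Delta` types below are simplified stand-ins, not the crate's actual definitions.

```rust
mod store {
    pub mod key {
        #[derive(Debug, PartialEq)]
        pub struct Key {
            pub name: String,
        }
    }

    pub mod datastore {
        // The API of one submodule needs a type from another submodule.
        use super::key::Key;

        pub struct Delta {
            pub key: Key,
            pub data: Vec<u8>,
        }
    }

    // Re-export the public types at the top level of `store`.
    pub use self::datastore::Delta;
    pub use self::key::Key;
}

// Users can now write one flat import instead of
// `use store::datastore::Delta; use store::key::Key;`.
use store::{Delta, Key};
```

The nested paths keep working, so existing imports are not broken by the re-exports.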
Reviewed By: singhsrb
Differential Revision: D13686913
fbshipit-source-id: 9fb3cce8783787aa5f3f974c7168afada5952712
Summary:
The latter tries to read from the disk, while the former is purely in memory and
thus more efficient.
Reviewed By: DurhamG, markbt
Differential Revision: D13603757
fbshipit-source-id: 5fd120ba4065d6a65cb2982db9ab81db3ea26524
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
Reviewed By: DurhamG
Differential Revision: D13615938
fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.
Reviewed By: markbt
Differential Revision: D13577592
fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
Summary:
The python version of the mutable packs set the permission to read-only after
writing them, while the rust version keeps them writeable. Let's make the rust
one more consistent.
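A sketch of how a finished pack could be marked read-only with the standard library. `make_readonly` is a hypothetical helper, and whether the real change uses this exact call is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Mark an already written file as read-only, leaving its other permission
/// bits as they are.
fn make_readonly(path: &Path) -> io::Result<()> {
    let mut permissions = fs::metadata(path)?.permissions();
    permissions.set_readonly(true);
    fs::set_permissions(path, permissions)
}
```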
Reviewed By: markbt
Differential Revision: D13573572
fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.
Reviewed By: markbt
Differential Revision: D13379278
fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
Summary:
The future of mercurial is rust, and one of the missing pieces is repacking of data/history packs. For now, let's implement a very basic packing strategy that just pulls all the packs into one, with one small optimization that puts all the delta chains close together in the output file.
At first, it's expected that this code will be driven by the existing python code, but more and more will be done in rust over time.
Reviewed By: DurhamG
Differential Revision: D13363853
fbshipit-source-id: ad1ac2039e1732f7141d99abf7f01804a9bde097
Summary:
Add "--dry-run" for fix-code.py and use it in test-check.
This avoids license header and version = "*" issues.
Reviewed By: ikostia
Differential Revision: D10213070
fbshipit-source-id: 9fdd49ead3dfcecf292d5f42c028f20e5dde65d3
Summary:
This is done by running `fix-code.py`. Note that those strings are
semvers so they do not pin down the exact version. An API-compatible upgrade
is still possible.
Reviewed By: ikostia
Differential Revision: D10213073
fbshipit-source-id: 82f90766fb7e02cdeb6615ae3cb7212d928ed48d
Summary:
This diff implements getBlob on top of the mercurial rust
datapack code. It adds a C++ binding on top of the rust code to
make it easier to use and hooks it up in the hg backing store.
Need to figure this out for our opensource and windows builds:
* Need to teach them how to build and link the rust code
* Need to add a windows version of the methods that accept paths;
  this is just a matter of adding a WCHAR version of the functions.
Reviewed By: strager
Differential Revision: D10433450
fbshipit-source-id: 45ce34fb9c383ea6018a0ca858581e0fe11ef3b5
Summary:
If the rust pack stores are used to access truncated pack files, currently they
panic. Instead, return a proper error showing what's wrong.
Reviewed By: quark-zju
Differential Revision: D10868299
fbshipit-source-id: 57fe5ec1ee4ee2a7bb10d2d5c5ca7082dc34125d
Summary: This is just the result of running `./contrib/fix-code.py $(hg files .)`
Reviewed By: ikostia
Differential Revision: D10213075
fbshipit-source-id: 88577c9b9588a5b44fcf1fe6f0082815dfeb363a
Summary:
The histpack format requires that entries in each file section be
written in topological order, so that future readers can compute ancestors by
just linearly scanning. Let's make the rust mutable history pack support this.
Technically the rust historypack reader does not require this for now, but the python
one does, so we need to enforce it.
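A sketch of one way to produce such an order, assuming "topological" here means that every node is written before its parents, so a forward scan from a node meets all of its ancestors. The `Node` alias, `topo_sort` and `visit` are hypothetical; the real writer may order entries differently.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a 20-byte node hash.
type Node = u32;

/// Order the nodes of one file section so that each node comes before its
/// parents. `entries` maps a node to its parents; parents that are not keys
/// of `entries` live outside this pack and are skipped.
fn topo_sort(entries: &HashMap<Node, Vec<Node>>) -> Vec<Node> {
    let mut visited = HashSet::new();
    let mut order = Vec::new();
    // Sort the starting points so the output is deterministic.
    let mut roots: Vec<Node> = entries.keys().copied().collect();
    roots.sort();
    for root in roots {
        visit(root, entries, &mut visited, &mut order);
    }
    // Post-order puts parents first; reversing puts children first.
    order.reverse();
    order
}

/// Depth-first walk along parent edges, emitting a node after its parents.
/// Recursive for brevity; a very deep history would need an explicit stack.
fn visit(
    node: Node,
    entries: &HashMap<Node, Vec<Node>>,
    visited: &mut HashSet<Node>,
    order: &mut Vec<Node>,
) {
    if !visited.insert(node) {
        return;
    }
    if let Some(parents) = entries.get(&node) {
        for &parent in parents {
            visit(parent, entries, visited, order);
        }
        order.push(node);
    }
}
```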
Reviewed By: kulshrax
Differential Revision: D10441286
fbshipit-source-id: dfdb57182909270b760bd79a100873aa3903a2a5
Summary:
During an ancestor traversal, we were adding items to the queue if they
hadn't been processed yet. In a highly merge-y history this could result in adding
an exponential number of items to the queue since we aren't preventing items
from being added until they are actually consumed.
The fix is to just add the items to the seen set as we add them to the queue.
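The shape of the fix, as a minimal sketch: `ancestors` and the `u32` node type are hypothetical stand-ins for the real traversal. The point is that `seen.insert` happens when a parent is enqueued, so a node reachable through many merge paths enters the queue only once.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first ancestor walk over a map from node to parents.
fn ancestors(start: u32, parents: &HashMap<u32, Vec<u32>>) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut result = Vec::new();

    seen.insert(start);
    queue.push_back(start);

    while let Some(node) = queue.pop_front() {
        result.push(node);
        if let Some(node_parents) = parents.get(&node) {
            for &parent in node_parents {
                // Mark at enqueue time: `insert` returns false if the parent
                // is already queued or processed, so it is never added twice.
                if seen.insert(parent) {
                    queue.push_back(parent);
                }
            }
        }
    }
    result
}
```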
Reviewed By: quark-zju
Differential Revision: D10434655
fbshipit-source-id: 430b51adb2d24a99d8c780031f3dbf22c56b9347
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.
Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong to the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.
Reviewed By: markbt
Differential Revision: D10132047
fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
Summary:
Copy functions from Mononoke to implement the Display trait
for a Node.
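For illustration, a hex `Display` implementation could look like the sketch below. The `Node` struct here is a simplified assumption (a bare 20-byte array), and the code is not the functions copied from Mononoke.

```rust
use std::fmt;

/// Simplified stand-in for the real `Node` type.
struct Node([u8; 20]);

impl fmt::Display for Node {
    /// Render the node as 40 lowercase hex characters.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for byte in self.0.iter() {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}
```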
Reviewed By: quark-zju
Differential Revision: D9768566
fbshipit-source-id: 6961026a9e4cdaf4a0f2592dc9284abebadb0aa3
Summary:
This was meant to be in a prior diff but was forgotten. This also
exposes an issue where we aren't producing ancestors in topological order.
Reviewed By: quark-zju
Differential Revision: D9380009
fbshipit-source-id: 6a49f0f31c3e107353f9192ca15cda0b1b9c3693
Summary:
v0 history packs require more complicated and slow logic for looking up
a node. Instead of complicating our rust implementation, let's just not support
v0.
Reviewed By: quark-zju
Differential Revision: D9373395
fbshipit-source-id: 6d28a3684966b55a617619e3cae765b2944919a0
Summary:
When calling get_ancestors with 'partial' enabled, we want to return a
key error if the first key can't be found, but not if later keys can't be found.
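A sketch of the intended behaviour, with hypothetical names throughout (`KeyError`, the `u32` keys and this `get_ancestors` signature are assumptions): a missing start key is always an error, while a missing later key is only an error when `partial` is off.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

#[derive(Debug, PartialEq)]
struct KeyError(u32);

/// Walk the ancestors of `start` in `store`, which maps a key to its parents.
fn get_ancestors(
    store: &HashMap<u32, Vec<u32>>,
    start: u32,
    partial: bool,
) -> Result<Vec<u32>, KeyError> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut found = Vec::new();

    seen.insert(start);
    queue.push_back(start);

    while let Some(key) = queue.pop_front() {
        match store.get(&key) {
            Some(parents) => {
                found.push(key);
                for &parent in parents {
                    if seen.insert(parent) {
                        queue.push_back(parent);
                    }
                }
            }
            // In partial mode a later key may live in another store.
            None if partial && key != start => continue,
            // The first key must always resolve, even in partial mode.
            None => return Err(KeyError(key)),
        }
    }
    Ok(found)
}
```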
Reviewed By: singhsrb
Differential Revision: D9367477
fbshipit-source-id: 0e9ad7ea82f83db7326392accab96bd31318f28e
Summary:
Previously HistoryIndex.write() accepted a vector and a hashmap that
contained Box<[u8]>. This diff changes it to be &Box<[u8]>, which allows us to
avoid a ton of allocations.
Reviewed By: quark-zju
Differential Revision: D9350962
fbshipit-source-id: 3f900c551584e3431202f3a30afd61aa10fbb78c
Summary:
I learned that Box::from() can be used to copy a slice into a box, so
let's replace my previous to_vec().into_boxed_slice() with this.
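The two spellings side by side, wrapped in hypothetical helper functions for comparison. Both copy the slice into a new heap allocation and produce the same `Box<[u8]>`.

```rust
/// Previous approach: go through an intermediate `Vec`.
fn copy_via_vec(bytes: &[u8]) -> Box<[u8]> {
    bytes.to_vec().into_boxed_slice()
}

/// Replacement: `Box<[T]>` implements `From<&[T]>` for cloneable `T`.
fn copy_via_box_from(bytes: &[u8]) -> Box<[u8]> {
    Box::from(bytes)
}
```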
Reviewed By: quark-zju
Differential Revision: D9350961
fbshipit-source-id: 94053b82cd64923dfabc9acf3a9dab6daca20cf3
Summary:
Only the dataindex needs the actual locations, so let's make the
locations vector optional and only pass it from dataindex.
Based on feedback from an earlier code review.
Reviewed By: quark-zju
Differential Revision: D9350960
fbshipit-source-id: 54ec34e1bd891ae585b22d916664700ce5417353
Summary: Apparently older rust needs a & on these match statements.
Reviewed By: phillco, quark-zju
Differential Revision: D9363026
fbshipit-source-id: fa802464d01b4074546076888e6d5c92155ddf4e
Summary:
Rust doesn't provide a convenient way to do `slice[range]?`, so let's
introduce an extension trait for allowing slice range reads and getting a Result
back.
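A minimal sketch of such an extension trait. The names `SliceExt`, `get_err` and `OutOfBounds`, and the `read_u32` helper, are assumptions made for this example and may not match the trait introduced by the diff.

```rust
use std::fmt;
use std::ops::Range;

#[derive(Debug, PartialEq)]
struct OutOfBounds {
    range: Range<usize>,
    len: usize,
}

impl fmt::Display for OutOfBounds {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "range {}..{} out of bounds for slice of length {}",
            self.range.start, self.range.end, self.len
        )
    }
}

impl std::error::Error for OutOfBounds {}

/// Range reads that return a `Result` instead of panicking.
trait SliceExt {
    fn get_err(&self, range: Range<usize>) -> Result<&[u8], OutOfBounds>;
}

impl SliceExt for [u8] {
    fn get_err(&self, range: Range<usize>) -> Result<&[u8], OutOfBounds> {
        let len = self.len();
        self.get(range.clone())
            .ok_or_else(|| OutOfBounds { range, len })
    }
}

/// Example caller: a bounds-checked big-endian read that composes with `?`.
fn read_u32(buf: &[u8], offset: usize) -> Result<u32, OutOfBounds> {
    let bytes = buf.get_err(offset..offset + 4)?;
    Ok(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```

With this in place a short buffer surfaces as an `Err` that can be propagated, rather than as an index panic.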
Reviewed By: markbt
Differential Revision: D9276216
fbshipit-source-id: 9a8cea8ffc062c4a2dd432dd4de7fdd4ccabf8d3
Summary:
Before we can finish the python bindings for HistoryPack, we need to
implement the Repackable trait.
Reviewed By: markbt
Differential Revision: D9273264
fbshipit-source-id: ed181d73c497a84fed5e0c85fad1d7d73ec52e4e
Summary:
Previously ancestor traversals would return an error if they
encountered a node that couldn't be resolved. In some cases we want to support
partial ancestor resolutions (like iterating over part of a history in one
store, while the rest in another store). Let's add an option that decides
whether a missing node is an error or not.
Reviewed By: markbt
Differential Revision: D9231397
fbshipit-source-id: ff3063acfb8da2d453f34221f1865f3123615b0c
Summary:
Now that MutableHistoryPack and HistoryIndex are implemented, we can
put the final HistoryPack code in place. Let's start by putting the boilerplate
definition down.
Reviewed By: markbt
Differential Revision: D9231387
fbshipit-source-id: d90f20603a7e08becda604eeda90d62ef4e88cbb
Summary:
Now that we have serializers for all the individual parts, let's make mutablehistorypack actually serialize.
This is still missing the bit where we topologically sort the nodes before writing them, so that will come later.
Reviewed By: markbt
Differential Revision: D9231401
fbshipit-source-id: 85703a44420bd9eee80fabe1fa4ffb0ebff3ecfd
Summary:
In an upcoming diff we'll begin writing the actual history data pack
file. It is primarily composed of HistoryEntry records, so let's implement and unit
test the serialization logic for them.
Reviewed By: markbt
Differential Revision: D9231391
fbshipit-source-id: 1b070ee5d15e06ea0c70a678dc6eb129e0ffaa20
Summary:
In an upcoming diff we'll start writing the actual history pack data
file. The file section headers are part of that file, so let's implement the
serialization logic and unit test it.
Reviewed By: markbt
Differential Revision: D9231400
fbshipit-source-id: eb1494e3b8fe3419f77edcaab25f640b48f16e4b
Summary:
The old ok_or would allocate a string every time. Since this is an
error condition, let's use ok_or_else to only allocate the string if there's an
error.
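The difference can be made visible by counting how often the error message is built. Everything except `Option::ok_or` and `Option::ok_or_else` is a hypothetical helper for this sketch.

```rust
use std::cell::Cell;

/// Build the error string and record that an allocation happened.
fn missing_key_message(key: &str, built: &Cell<usize>) -> String {
    built.set(built.get() + 1);
    format!("key '{}' not found", key)
}

/// `ok_or` evaluates its argument up front, even when `entry` is `Some`.
fn find_eager(entry: Option<u64>, key: &str, built: &Cell<usize>) -> Result<u64, String> {
    entry.ok_or(missing_key_message(key, built))
}

/// `ok_or_else` only runs the closure when `entry` is `None`.
fn find_lazy(entry: Option<u64>, key: &str, built: &Cell<usize>) -> Result<u64, String> {
    entry.ok_or_else(|| missing_key_message(key, built))
}
```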
Reviewed By: markbt
Differential Revision: D9231394
fbshipit-source-id: e4912bfd26925077fffb7c686a7c1c2f3cb36f7c
Summary: Add an API for reading individual nodes from an index.
Reviewed By: markbt
Differential Revision: D9231398
fbshipit-source-id: 1d7c4b34b121cba62bc28eba2323807cfedbaf3b
Summary:
Now that historyindex can serialize, let's add logic to perform file
entry lookups. A later diff will allow node lookups.
Reviewed By: markbt
Differential Revision: D9231392
fbshipit-source-id: 9ab8e29ce85c0f372a7a432318d6d903e6c44bcc
Summary:
Adds logic to write the actual index, including the fanout table, the
file name index, and the node indexes.
Reviewed By: markbt
Differential Revision: D9231402
fbshipit-source-id: f382c4a56c5c53b83232b43adb966a7aff3878db
Summary:
Adds the initial reader/writer boilerplate for creating a HistoryIndex
and writing the header.
Reviewed By: markbt
Differential Revision: D9231389
fbshipit-source-id: ece1290416e8cde23a825ee3bd1a555a4ebded35
Summary:
To start the history pack implementation, let's start by implementing
reader/writers for the various parts. In this diff we do the NodeIndexEntry.
Reviewed By: markbt
Differential Revision: D9231403
fbshipit-source-id: 904c1a094e63b0f4cebef84a30a7dd89bdaf1e1f
Summary:
To start the history pack implementation, let's start by implementing
reader/writers for the various parts. In this diff we do the FileIndexEntry.
Reviewed By: markbt
Differential Revision: D9231395
fbshipit-source-id: d054959796ee4e3d51df8f3533712f8f959a04d2
Summary:
This moves the ancestor iteration logic for cases where we iterate one
by one. This will be used by the HistoryPack code in upcoming diffs.
Reviewed By: quark-zju
Differential Revision: D9136978
fbshipit-source-id: e60b0a1e2ee5036938b51bbd910fbaf548d7aa75
Summary:
This moves the ancestor iteration logic into its own class, with
support for cases where we receive bulk sets of ancestors at once. A future diff
will add similar logic for ancestor traversals where we receive one hash at a
time.
Reviewed By: quark-zju
Differential Revision: D9136985
fbshipit-source-id: 7f918476f777020b3436f5104ad3bf4b00fe9827