sapling/eden/integration/hg/histedit_test.py

#!/usr/bin/env python3
#
# Copyright (c) 2016-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.
import os
from textwrap import dedent
from .lib.hg_extension_test_base import EdenHgTestCase, hg_test
Reimplement dirstate used by Eden's Hg extension as a subclass of Hg's dirstate. Summary: This is a major change to Eden's Hg extension. Our initial attempt to implement `edendirstate` was to create a "clean room" implementation that did not share code with `mercurial/dirstate.py`. This was helpful in uncovering the subset of the dirstate API that matters for Eden. It also provided a better safeguard against upstream changes to `dirstate.py` in Mercurial itself. In this implementation, the state transition management was mostly done on the server in `Dirstate.cpp`. We also made a modest attempt to make `Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for Git at some point. However, as we have tried to support more of the sophisticated functionality in Mercurial, particularly `hg histedit`, achieving parity between the clean room implementation and Mercurial's internals has become more challenging. Ultimately, the clean room implementation is likely the right way to go for Eden, but for now, we need to prioritize having feature parity with vanilla Hg when using Eden. Once we have a more complete set of integration tests in place, we can reimplement Eden's dirstate more aggressively to optimize things. Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]] extension has already demonstrated that it is possible to provide a faithful dirstate implementation that subclasses the original `dirstate` while using a different storage mechanism. As such, I used `sqldirstate` as a model when implementing the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`). In particular, `sqldirstate` uses SQL tables as storage for the following private fields of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. 
Because `_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we do not support them in `eden_dirstate` and add code to ensure the codepaths that would access them in `dirstate` never get exercised. Similarly, we also implemented `eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden. It appears to be primarily used for checking whether a path to a file already exists in the dirstate as a directory. We can protect against that in more efficient ways.) That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set of files that have been marked "copied" in the current dirstate, so it is fairly small and can be stored on disk or in memory with little concern. `_map` is a bit trickier because it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles` table, which we store as Thrift-serialized data in an ordinary file along with the `_copymap` data. In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`, which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data structures that translate their method calls to Thrift server calls. I expect we will want to optimize this in the future via some client-side caching, as well as creating batch APIs for talking to the server via Thrift.
One advantage of this new implementation is that it enables us to delete `eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`. Between the recent implementation of `dirstate.walk()` for Eden and this switch to the real dirstate, we can now use the default implementation of `hg add` and `hg remove` (although we have to play some tricks, like in the implementation of `eden_dirstate.status()` in order to make `hg remove` work). In the course of doing this revision, I discovered that I had to make a minor fix to `EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as `hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which case the glob was not matching `foo`! I also had to do some work in `eden_dirstate.status()` in which the `match` argument was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number of things with the `match` specified as a filter, so the output of `status()` must be filtered by `match` accordingly. Ultimately, this seems like work that would be better done on the server, but for simplicity, we're just going to do it in Python, for now. For the reasons explained above, this revision deletes a lot of code from `Dirstate.cpp`. As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was testing should probably be converted to integration tests. At a high level, the role of `DirstatePersistence` has not changed, but the exact data it writes is much different. Its corresponding unit test is also disabled, for now. Note that this revision does not change the name of the file where "dirstate data" is written (this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing instances of this file once this change lands. (It is still early enough in the project that it does not seem worth the overhead of a proper migration.)
The true test of the success of this new approach is the ease with which we can write more integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very few changes to `eden_dirstate.py`. Reviewed By: simpkins Differential Revision: D5071778 fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
2017-05-26 21:51:30 +03:00
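The `hg add foo` glob issue described in the commit message can be illustrated with a minimal sketch. This is not the actual `EdenMatchInfo.make_glob_list()` code, which is not shown here; the function signature, the `root` parameter, and the use of `os.path.isdir` are assumptions made purely for illustration. The idea is that a path is expanded to a recursive glob only when it is a directory, so a plain file still matches itself.

```python
import os


def make_glob_list(root, paths):
    """Hypothetical helper: build glob patterns for paths relative to root.

    A directory expands to a recursive glob; a plain file (or a path that
    does not exist) is kept as-is so the pattern still matches the file.
    """
    globs = []
    for path in paths:
        if os.path.isdir(os.path.join(root, path)):
            globs.append(path + '/**/*')
        else:
            globs.append(path)
    return globs
```

With a working copy containing a file `foo` and a directory `bar`, this sketch yields `foo` and `bar/**/*`, rather than turning `foo` into `foo/**/*`.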
from .lib.histedit_command import HisteditCommand
from ..lib import hgrepo
@hg_test
class HisteditTest(EdenHgTestCase):
    def populate_backing_repo(self, repo):
        repo.write_file('first', '')
        self._commit1 = repo.commit('first commit')

        repo.write_file('second', '')
        self._commit2 = repo.commit('second commit')

        repo.write_file('third', '')
        self._commit3 = repo.commit('third commit')

    def test_stop_at_earlier_commit_in_the_stack_without_reordering(self):
        commits = self.repo.log()
        self.assertEqual([self._commit1, self._commit2, self._commit3], commits)

        # histedit, stopping in the middle of the stack.
        histedit = HisteditCommand()
        histedit.pick(self._commit1)
        histedit.stop(self._commit2)
        histedit.pick(self._commit3)

        # We expect histedit to terminate with a nonzero exit code in this case.
        with self.assertRaises(hgrepo.HgError) as context:
            histedit.run(self)
        head = self.repo.log(revset='.')[0]
        expected_msg = (
            'Changes committed as %s. '
            'You may amend the changeset now.' % head[:12]
        )
        self.assertIn(expected_msg, str(context.exception))

        # Verify the new commit stack and the histedit termination state.
        # Note that the hash of commit[0] is unpredictable because Hg gives it a
        # new hash in anticipation of the user amending it.
        parent = self.repo.log(revset='.^')[0]
        self.assertEqual(self._commit1, parent)
        self.assertEqual(
            ['first commit', 'second commit'], self.repo.log('{desc}')
        )

        # Make sure the working copy is in the expected state.
        self.assert_status_empty()
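
The test above drives histedit through `HisteditCommand`, whose implementation is not shown in this file. As a rough sketch of how such a helper might accumulate a plan, the hypothetical `HisteditPlan` class below records `pick` and `stop` actions in order and renders them as one action per line. The class name, the `render()` method, and the output format are assumptions for illustration only, and `stop` is assumed to be an action accepted by the histedit setup the test runs against.

```python
class HisteditPlan:
    """Hypothetical stand-in for a histedit plan builder (not the real helper)."""

    def __init__(self):
        # Ordered list of (action, commit_hash) pairs.
        self._actions = []

    def pick(self, commit_hash):
        self._actions.append(('pick', commit_hash))

    def stop(self, commit_hash):
        self._actions.append(('stop', commit_hash))

    def render(self):
        # One "<action> <hash>" line per step, in the order they were added.
        return ''.join('%s %s\n' % (action, commit_hash)
                       for action, commit_hash in self._actions)
```

A plan built as `pick`, `stop`, `pick` renders three lines in that order. A real helper would presumably write such text to a file and hand it to `hg histedit`, but that step is left out here because it depends on the test harness.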
        self.assertSetEqual(
            set(['.eden', '.hg', 'first', 'second']),
            set(os.listdir(self.repo.get_canonical_root()))
        )

        self.hg('histedit', '--continue')
        self.assertEqual(
            ['first commit', 'second commit', 'third commit'],
            self.repo.log('{desc}')
        )
        self.assert_status_empty()
        self.assertSetEqual(
            set(['.eden', '.hg', 'first', 'second', 'third']),
            set(os.listdir(self.repo.get_canonical_root()))
        )

    def test_reordering_commits_without_merge_conflicts(self):
        self.assertEqual(
            ['first commit', 'second commit', 'third commit'],
            self.repo.log('{desc}')
        )

        # histedit, reordering the stack in a conflict-free way.
        histedit = HisteditCommand()
        histedit.pick(self._commit2)
        histedit.pick(self._commit3)
        histedit.pick(self._commit1)
        histedit.run(self)

        self.assertEqual(
            ['second commit', 'third commit', 'first commit'],
            self.repo.log('{desc}')
        )
        self.assert_status_empty()
        self.assertSetEqual(
            set(['.eden', '.hg', 'first', 'second', 'third']),
            set(os.listdir(self.repo.get_canonical_root()))
        )

    def test_drop_commit_without_merge_conflicts(self):
        self.assertEqual(
            ['first commit', 'second commit', 'third commit'],
            self.repo.log('{desc}')
        )

        # histedit, dropping the second commit in a conflict-free way.
        histedit = HisteditCommand()
        histedit.pick(self._commit1)
        histedit.drop(self._commit2)
        histedit.pick(self._commit3)
        histedit.run(self)

        self.assertEqual(
            ['first commit', 'third commit'], self.repo.log('{desc}')
2017-05-26 21:51:30 +03:00
        )
        self.assert_status_empty()
        self.assertSetEqual(
            set(['.eden', '.hg', 'first', 'third']),
            set(os.listdir(self.repo.get_canonical_root()))
        )

    def test_roll_two_commits_into_parent(self):
        self.assertEqual(
            ['first commit', 'second commit', 'third commit'],
            self.repo.log('{desc}')
        )

        # histedit, rolling the second and third commits into the first.
        histedit = HisteditCommand()
        histedit.pick(self._commit1)
        histedit.roll(self._commit2)
        histedit.roll(self._commit3)
        histedit.run(self)

        self.assertEqual(['first commit'], self.repo.log('{desc}'))
        self.assert_status_empty()
        self.assertSetEqual(
            set(['.eden', '.hg', 'first', 'second', 'third']),
            set(os.listdir(self.repo.get_canonical_root()))
        )

    def test_abort_after_merge_conflict(self):
        self.write_file('will_have_confict.txt', 'original\n')
        self.hg('add', 'will_have_confict.txt')
        commit4 = self.repo.commit('commit4')
        self.write_file('will_have_confict.txt', '1\n')
        commit5 = self.repo.commit('commit5')
        self.write_file('will_have_confict.txt', '2\n')
        commit6 = self.repo.commit('commit6')

        histedit = HisteditCommand()
        histedit.pick(commit4)
        histedit.pick(commit6)
        histedit.pick(commit5)
        original_commits = self.repo.log()

        with self.assertRaises(hgrepo.HgError) as context:
            histedit.run(self, ancestor=commit4)
        expected_msg = (
            ('Fix up the change (pick %s)\n' % commit6[:12]) +
            '  (hg histedit --continue to resume)'
        )
        self.assertIn(expected_msg, str(context.exception))
        self.assert_status({
            'will_have_confict.txt': 'M',
        })
        expected_contents_with_conflict_markers = dedent(
            '''\
            <<<<<<< local
            original
            =======
            2
            >>>>>>> histedit
            '''
        )
        self.assertEqual(
            expected_contents_with_conflict_markers,
            self.read_file('will_have_confict.txt')
        )

        self.hg('histedit', '--abort')
        self.assertEqual('2\n', self.read_file('will_have_confict.txt'))
        self.assertListEqual(
            original_commits,
            self.repo.log(),
            msg='The original commit hashes should be restored by the abort.'
        )
        self.assert_status_empty()