With this change, when tree manifests are enabled (in .hg/requires),
commits will be written with one manifest revlog per directory. The
manifest revlogs are stored in
.hg/store/meta/$dir/00manifest.[id].
Flat manifests can still be read and interacted with as usual (they
are also read into treemanifest instances). The functionality for
writing treemanifest as a flat manifest to disk is still left in the
code; tests still pass with '_treeinmem=True' hardcoded.
Exchange is not yet implemented.
When encoding long paths, the pure Python code strips the first
directory from the path, while the native code currently strips the
first 5 characters. This discrepancy has not been a problem so far,
since we have not stored anything in directories other than
data/. However, we will soon be storing submanifest revlogs in
metadata/, so the discrepancy will have to go [1]. Since file
collisions are avoided by the hashing alone (which is done on the full
unencoded path), it doesn't really matter whether we drop the first
dir, the first 5 characters, or special-case non-data/. To avoid
touching the C code, let's always strip the first 5 characters.
[1] Or maybe elsewhere, but the discrepancy is ugly either way.
Previously the fncache was cleaned up at read time by noticing when it was out
of sync. This caused writes to happen outside the scope of transactions and
could have caused race conditions. With this change, we'll keep the fncache
up-to-date as we go by removing old entries during repair.strip.
The fncache was not being properly invalidated each time the lock was taken, so
in theory it could contain old data from prior to the caller having the lock.
This changes it to be invalidated as soon as the lock is taken (same as all our
other caches).
Previously the fncache was written at lock.release time. This meant it was not
tracked by a transaction, and if an error occurred during the fncache write it
would fail to update the fncache, but would not rollback the transaction,
resulting in an fncache that was not in sync with the files on disk (which
causes verify to fail, and causes streaming clones to not copy all the revlogs).
This uses the new transaction backup mechanism to make the fncache transacted.
It also moves the fncache from being written at lock.release time, to being
written at transaction.close time.
The fncache could rewrite itself during a read operation if it noticed any
entries that were no longer on disk. This was problematic because it caused
Mercurial to perform write operations outside the scope of a lock or
transaction, which could interefere with any other pending writes.
This will be replaced in a future patch by logic that cleans up the fncache
as files are deleted during strips.
Some extensions find it useful to be able to walk the non-data files in the
repo, so I'm moving that part of the walk to a separate function.
In particular, this allows an extension to interact with only the non-filelog
store data (for instance, when cloning everything but filelogs).
The check pattern only checked for whitespace between keyword and operator.
Now it also warns:
> x = f(),7
missing whitespace after ,
> x = f()+7
missing whitespace in expression
This patch invokes "osutil.listdir()" via vfs object.
The function added newly to "abstractvfs" is named not as "listdir()"
but as "readdir()", because:
- "os.listdir()" seems to be more familiar as "listdir()" than
"osutil.listdir()"
- "osutil.listdir()" returns also type of each files like
"readdir()" POSIX API: even though "d_type" field of "dirent"
structure is defined mainly only on BSD/Linux
This patch invokes "osutil.listdir()" via "rawvfs" object to avoid
filename encoding, because the path passed to "osutil.listdir()"
shouldn't be encoded.
This patch also omits importing "osutil" module, because it is no
longer used.
Adds a __contains__ method to fncachestore to check for file/dir existence (using fncache.__contains__).
Also extends fncache.__contains__ to check for directories (by prefix matching)
This patch invokes "os.path.isdir()" via "rawvfs" object to avoid
filename encoding, because the path passed to "os.path.isdir()"
shouldn't be encoded.
This patch newly adds "self.rawvfs" field only to "basicstore" and
"encodedstore", because "fncachestore" has "self.rawvfs" already.
This patch replaces invocation of "getsize()", which calls "os.stat()"
internally, by "vfs.stat()".
The object referred by "self.rawvfs" is used internally by
"_fncachevfs" and doesn't encode filename for each file API invocation.
This patch invokes "os.stat()" via "self.rawvfs" to avoid redundant
filename encoding: invocation of "os.stat()" via "self.vfs" hides
filename encoding and encoding result from caller, so it is not
appropriate, when both encoded and non-encoded filenames should be
yield.
Even though changeset 89eacc6262af improved stream_out performance by
"self.pathsep + path", this patch replaces it by
"os.path.join(self.base, path)" of vfs. So, this may increase cost to
join path components.
But this shouldn't have large impact, because:
- such cost is much less than cost of "os.stat()" which causes
system call invocation
- "datafiles()" of store object is invoked only for "hg manifest
--all" or "hg verify" which are both heavy functions
This just replaces "os.stat()" invocation: refactoring around
"self.createmode" and "vfs.createmode" initialization is omitted.
This patch also newly adds "stat()" function to "abstractvfs".
This patch defines "join()" in each classes derived from "abstractvfs"
except "vfs", which already defines it.
This allows all vfs instances to be used for indirect file API
invocation.
This patch initializes "vfs" field in the constructor of each store
classes to use it for initialization of others.
In this patch, "self.vfs.base" is used to initialize "self.path",
because redo join of path components for "self.path" is redundant.
If the input path is already longer than _maxstorepathlen, then we can skip
doing the basic encoding (encodedir, _encodefname and _auxencode) and directly
proceed to the hashed encoding. Those encodings, if at all, will make the path
only longer.
For backwards compatibility, aliases for the old names are added,
except for "abstractopener", "statichttpopener" and "_fncacheopener",
because these are not used in Mercurial core implementation after this
patch.
"_fncacheopener" was only referred in "fncachestore" constructor, so
this patch also renames from "_fncacheopener" to "_fncachevfs" there.
For a netbeans clone on Windows 7 x64:
Before:
$ hg perffncacheencode
! wall 3.516000 comb 3.525623 user 3.525623 sys 0.000000 (best of 3)
After:
$ hg perffncacheencode
! wall 3.443000 comb 3.447622 user 3.447622 sys 0.000000 (best of 3)
For a netbeans clone on Windows 7 x64:
Before:
$ hg perffncacheload
! wall 0.124000 comb 0.124801 user 0.124801 sys 0.000000 (best of 76)
After:
$ hg perffncacheload
! wall 0.096000 comb 0.093601 user 0.078001 sys 0.015600 (best of 97)
For a netbeans clone on Windows 7 x64:
Before:
$ hg perffncachewrite
! wall 0.210000 comb 0.218401 user 0.202801 sys 0.015600 (best of 47)
After:
$ hg perffncachewrite
! wall 0.104000 comb 0.109201 user 0.078000 sys 0.031200 (best of 95)