When encoding long paths, the pure Python code strips the first
directory from the path, while the native code currently strips the
first 5 characters. This discrepancy has not been a problem so far,
since we have not stored anything in directories other than
data/. However, we will soon be storing submanifest revlogs in
metadata/, so the discrepancy will have to go [1]. Since file
collisions are avoided by the hashing alone (which is done on the full
unencoded path), it doesn't really matter whether we drop the first
dir, the first 5 characters, or special-case non-data/. To avoid
touching the C code, let's always strip the first 5 characters.
[1] Or maybe elsewhere, but the discrepancy is ugly either way.
- Mac OS X has problems with filenames starting with '._'
(e.g. '.FOO' -> '._f_o_o' is now encoded as '~2e_f_o_o')
- Explorer of Windows Vista and Windows 7 strip leading spaces of
path elements of filenames when copying trees
Above problems are avoided by encoding the first space (as '~20') or
period (as '~2e') of all path elements.
This introduces a new entry 'dotencode' in .hg/requires, that is,
a new repository filename layout (inside .hg/store).
Newly created repositories require 'dotencode' by default. Specifying
[format]
dotencode = False
in a config file will use the old format instead.
Prior Mercurial versions will abort with the message
abort: requirement 'dotencode' not supported!
when trying to access a local repository that requires 'dotencode'.
New 'dotencode' repositories can be converted to the previous
repository format with
hg --config format.dotencode=0 clone --pull repoA repoB
Windows won't create directories with names ending in period or space, so
we encode the last period/space character in directory names of non-hashed
paths in the store using reversible ~xx encoding (' ' -> '~20', '.' -> '~2e').
With this change it is possible to remove a directory ending in period or space
that was inadvertantly checked in on a linux system while still being able
to clone such a repository with its full history to Windows (see also issue793).
Windows won't create directories with names ending in period or space, so
we replace the last period/space character in truncated directory names of
hashed paths with some other character (underbar).