mirror of
https://github.com/facebook/sapling.git
synced 2024-10-11 09:17:30 +03:00
d678fe1702
Summary:
`ascii` was used as the default / fallback, which is not a user-friendly choice.

Nowadays utf-8 dominates:
- Rust stdlib is utf-8.
- Ruby since 1.9 is utf-8 by default.
- Python 3 is unicode by default.
- Windows 10 adds utf-8 code page.

Given the fact that:
- Our CI sets HGENCODING to utf-8
- Nuclide passes `--encoding=utf-8` to every command.
- Some people have messed up with `LC_*` and complained about hg crashes.
- utf-8 is a super set of ascii, nobody complains that they want `ascii` encoding and the `utf-8` encoding messed their setup up.

Let's just use `utf-8` as the default encoding. More aggressively, if someone sets `ascii` as the encoding, it's almost always a mistake. Auto-correct that to `utf-8` too.

This should also make future integration with Rust easier (where it's enforced utf-8 and does not have an option to change the encoding). In the future we might just drop the flexibility of choosing customized encoding, so this diff autofixes `ascii` to `utf-8`, instead of allowing `ascii` to be set. We cannot enforce `utf-8` yet, because of Windows.

Here is our encoding strategy vs the upstream's:

| item           | upstream (current) | upstream (ideal) | ours (current) | ours (ideal) |
| CLI argv       | bytes              | bytes            | utf-8 [1]      | utf-8        |
| path           | bytes              | auto [3]         | migrating [2]  | utf-8        |
| commit message | utf-8              | utf-8            | utf-8          | utf-8        |
| bookmark name  | utf-8              | utf-8            | utf-8          | utf-8        |
| file content   | bytes              | bytes            | bytes          | bytes        |

[1]: Argv was accidentally enforced utf-8 for command-line arguments by a Rust wrapper. But it simplified a lot of things and is kind of ok: everything that can be passed as CLI arguments are utf-8: -M commit message, -b bookmark, paths, etc. There is no "file content" passed via CLI arguments.

[2]: Path is controversial, because it's possible for systems to have non-utf8 paths. The upstream behavior is incorrect if a repo gets shared among different encoding systems (ex. both Linux and Windows). We have to know the encoding of paths to be able to convert them suitable for the local system. One way is to enforce UTF-8 for paths. The other is to keep encoding information stored with individual paths (like Ruby strings). The UTF-8 approach is much simpler with the tradeoff that non-utf-8 paths become unsupported, which seems to be a reasonable trade-off.

[3]: See https://www.mercurial-scm.org/wiki/WindowsUTF8Plan.

Reviewed By: singhsrb

Differential Revision: D17098991

fbshipit-source-id: c0ff1e586a887233bd43cdb854fb3538aa9b70c2
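The auto-correction described in the summary can be pictured with a short Python sketch. It is an illustration only: `normalize_encoding` and `DEFAULT_ENCODING` are hypothetical names, not the code this commit changes, and the fallback for unknown codec names is an assumption made for the example.

```python
import codecs

# Assumption: mirrors the default the summary argues for.
DEFAULT_ENCODING = "utf-8"


def normalize_encoding(name):
    """Return the encoding to use, upgrading missing or ASCII values to UTF-8.

    Hypothetical helper for illustration; not Sapling's actual function.
    """
    if not name:
        # Nothing configured: use the default instead of ascii.
        return DEFAULT_ENCODING
    try:
        # codecs.lookup resolves aliases such as "US-ASCII" to "ascii".
        canonical = codecs.lookup(name).name
    except LookupError:
        # Assumption for this sketch: unknown names also fall back.
        return DEFAULT_ENCODING
    if canonical == "ascii":
        # Every ASCII byte sequence is also valid UTF-8, so the upgrade
        # cannot change how existing ASCII data decodes.
        return DEFAULT_ENCODING
    return name


print(normalize_encoding("ascii"))    # utf-8
print(normalize_encoding("latin-1"))  # latin-1
```

Other encodings pass through unchanged, which matches the summary's note that `utf-8` cannot be enforced everywhere yet.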
157 lines
6.2 KiB
Perl
  $ . helpers-usechg.sh

#require svn svn-bindings

  $ cat >> $HGRCPATH <<EOF
  > [extensions]
  > convert =
  > EOF

  $ svnadmin create svn-repo
  $ svnadmin load -q svn-repo < "$TESTDIR/svn/encoding.svndump"

Convert while testing all possible outputs

  $ hg --debug convert svn-repo A-hg --config progress.debug=1
  initializing destination A-hg repository
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  run hg sink pre-conversion action
  scanning source...
  found trunk at 'trunk'
  found tags at 'tags'
  found branches at 'branches'
  found branch branch\xc3\xa9 at 5 (esc)
  found branch branch\xc3\xa9e at 6 (esc)
  progress: scanning: 1/4 revisions (25.00%)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  fetching revision log for "/trunk" from 4 to 0
  parsing revision 4 (2 changes)
  parsing revision 3 (4 changes)
  parsing revision 2 (3 changes)
  parsing revision 1 (3 changes)
  no copyfrom path, don't know what to do.
  '/branches' is not under '/trunk', ignoring
  '/tags' is not under '/trunk', ignoring
  progress: scanning: 2/4 revisions (50.00%)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9 (glob)
  fetching revision log for "/branches/branch\xc3\xa9" from 5 to 0 (esc)
  parsing revision 5 (1 changes)
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9 (glob)
  found parent of branch /branches/branch\xc3\xa9 at 4: /trunk (esc)
  progress: scanning: 3/4 revisions (75.00%)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9e (glob)
  fetching revision log for "/branches/branch\xc3\xa9e" from 6 to 0 (esc)
  parsing revision 6 (1 changes)
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9e (glob)
  found parent of branch /branches/branch\xc3\xa9e at 5: /branches/branch\xc3\xa9 (esc)
  progress: scanning: 4/4 revisions (100.00%)
  progress: scanning: 5/4 revisions (125.00%)
  progress: scanning: 6/4 revisions (150.00%)
  progress: scanning (end)
  sorting...
  converting...
  5 init projA
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/trunk@1
  progress: converting: 0/6 revisions (0.00%)
  committing changelog
  4 hello
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/trunk@2
  progress: converting: 1/6 revisions (16.67%)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  progress: scanning paths: /trunk/\xc3\xa0 0/3 paths (0.00%) (esc)
  progress: scanning paths: /trunk/\xc3\xa0/e\xcc\x81 1/3 paths (33.33%) (esc)
  progress: scanning paths: /trunk/\xc3\xa9 2/3 paths (66.67%) (esc)
  progress: scanning paths (end)
  progress: getting files: \xc3\xa0/e\xcc\x81 1/2 files (50.00%) (esc)
  progress: getting files: \xc3\xa9 2/2 files (100.00%) (esc)
  committing files:
  \xc3\xa0/e\xcc\x81 (esc)
  \xc3\xa9 (esc)
  committing manifest
  committing changelog
  progress: getting files (end)
  3 copy files
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/trunk@3
  progress: converting: 2/6 revisions (33.33%)
  progress: scanning paths: /trunk/\xc3\xa0 0/4 paths (0.00%) (esc)
  gone from -1
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  progress: scanning paths: /trunk/\xc3\xa8 1/4 paths (25.00%) (esc)
  copied to \xc3\xa8 from \xc3\xa9@2 (esc)
  progress: scanning paths: /trunk/\xc3\xa9 2/4 paths (50.00%) (esc)
  gone from -1
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  progress: scanning paths: /trunk/\xc3\xb9 3/4 paths (75.00%) (esc)
  mark /trunk/\xc3\xb9 came from \xc3\xa0:2 (esc)
  progress: scanning paths (end)
  progress: getting files: \xc3\xa0/e\xcc\x81 1/4 files (25.00%) (esc)
  progress: getting files: \xc3\xa9 2/4 files (50.00%) (esc)
  progress: getting files: \xc3\xa8 3/4 files (75.00%) (esc)
  progress: getting files: \xc3\xb9/e\xcc\x81 4/4 files (100.00%) (esc)
  committing files:
  \xc3\xa8 (esc)
  \xc3\xa8: copy \xc3\xa9:6b67ccefd5ce6de77e7ead4f5292843a0255329f (esc)
  \xc3\xb9/e\xcc\x81 (esc)
  \xc3\xb9/e\xcc\x81: copy \xc3\xa0/e\xcc\x81:a9092a3d84a37b9993b5c73576f6de29b7ea50f6 (esc)
  committing manifest
  committing changelog
  progress: getting files (end)
  2 remove files
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/trunk@4
  progress: converting: 3/6 revisions (50.00%)
  progress: scanning paths: /trunk/\xc3\xa8 0/2 paths (0.00%) (esc)
  gone from -1
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  progress: scanning paths: /trunk/\xc3\xb9 1/2 paths (50.00%) (esc)
  gone from -1
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/trunk (glob)
  progress: scanning paths (end)
  progress: getting files: \xc3\xa8 1/2 files (50.00%) (esc)
  progress: getting files: \xc3\xb9/e\xcc\x81 2/2 files (100.00%) (esc)
  committing files:
  committing manifest
  committing changelog
  progress: getting files (end)
  1 branch to branch\xc3\xa9 (esc)
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/branches/branch\xc3\xa9@5 (esc)
  progress: converting: 4/6 revisions (66.67%)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9 (glob)
  progress: scanning paths: /branches/branch\xc3\xa9 0/1 paths (0.00%) (esc)
  progress: scanning paths (end)
  committing changelog
  0 branch to branch\xc3\xa9e (esc)
  source: svn:afeb9c47-92ff-4c0c-9f72-e1f6eb8ac9af/branches/branch\xc3\xa9e@6 (esc)
  progress: converting: 5/6 revisions (83.33%)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9e (glob)
  progress: scanning paths: /branches/branch\xc3\xa9e 0/1 paths (0.00%) (esc)
  progress: scanning paths (end)
  committing changelog
  progress: converting (end)
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9e (glob)
  reparent to file:/*/$TESTTMP/svn-repo (glob)
  reparent to file:/*/$TESTTMP/svn-repo/branches/branch%C3%A9e (glob)
  updating tags
  committing files:
  .hgtags
  committing manifest
  committing changelog
  run hg sink post-conversion action
  $ cd A-hg
  $ hg up
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Check tags are in UTF-8

  $ cat .hgtags
  e94e4422020e715add80525e8f0f46c9968689f1 branch\xc3\xa9e (esc)
  f7e66f98380ed1e53a797c5c7a7a2616a7ab377d branch\xc3\xa9 (esc)

  $ cd ..
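
The expected output above shows the same names in three byte-level forms: `\xNN` escapes on `(esc)` lines, percent-encoding in the `reparent` URLs, and two different UTF-8 spellings of an accented "e" (`\xc3\xa9` versus `e\xcc\x81`). The Python sketch below relates them. It is a reading aid under stated assumptions, not part of the test: `escape_bytes` is a hypothetical helper that only approximates how escaped lines are rendered, and it is not the test runner's own code.

```python
import unicodedata
from urllib.parse import quote, unquote


def escape_bytes(raw: bytes) -> str:
    """Render non-ASCII bytes as \\xNN escapes (hypothetical helper)."""
    # latin-1 maps each byte to the code point with the same value, and
    # unicode_escape then writes code points 0x80-0xff as \xNN.
    return raw.decode("latin-1").encode("unicode_escape").decode("ascii")


branch = b"branch\xc3\xa9"     # bytes behind the escaped branch name
name = branch.decode("utf-8")  # "branch" followed by U+00E9

precomposed = b"\xc3\xa9".decode("utf-8")  # U+00E9, one code point
decomposed = b"e\xcc\x81".decode("utf-8")  # "e" + U+0301 combining acute

# The escaped form used on (esc) lines.
print(escape_bytes(branch))

# The percent-encoded form used in the svn URLs, and its inverse.
print(quote(name))
print(unquote("branch%C3%A9e") == name + "e")

# The two spellings differ as strings but agree after NFC normalization.
print(precomposed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

The last two lines show why the file name `e\xcc\x81` is listed with different bytes than `\xc3\xa9`: the test data carries both a decomposed and a precomposed accented character, and the conversion is expected to preserve each byte sequence as-is.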