A Scalable, User-Friendly Source Control System.
Jun Wu d678fe1702 encoding: replace 'ascii' with 'utf-8' automatically
Summary:
`ascii` was used as the default / fallback, which is not a user-friendly choice.
Nowadays utf-8 dominates:

- Rust stdlib is utf-8.
- Ruby since 1.9 is utf-8 by default.
- Python 3 is unicode by default.
- Windows 10 adds a utf-8 code page.

Given that:

- Our CI sets HGENCODING to utf-8.
- Nuclide passes `--encoding=utf-8` to every command.
- Some people have misconfigured `LC_*` and complained about hg crashes.
- utf-8 is a superset of ascii, and nobody complains that they wanted the
  `ascii` encoding and the `utf-8` encoding messed up their setup.

Let's just use `utf-8` as the default encoding. More aggressively, if someone
sets `ascii` as the encoding, it's almost always a mistake. Auto-correct that
to `utf-8` too.
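
A minimal sketch of the auto-correction described above, not the actual change
in this diff. The helper names `normalize_encoding` and `pick_encoding` are
hypothetical; only `HGENCODING` comes from the text above. It assumes Python's
`codecs.lookup` is an acceptable way to recognize aliases of `ascii`.

```python
import codecs
import os


def normalize_encoding(name):
    """Return `name`, except that ascii (or any alias of it) becomes utf-8.

    Hypothetical helper for illustration only.
    """
    try:
        canonical = codecs.lookup(name).name
    except LookupError:
        # Unknown codec: leave it alone and let the caller report the error.
        return name
    if canonical == "ascii":
        # utf-8 is a superset of ascii, so this never loses information.
        return "utf-8"
    return name


def pick_encoding(environ=None, default="utf-8"):
    """Choose the encoding from HGENCODING, falling back to utf-8.

    Hypothetical helper; the real lookup order may differ.
    """
    environ = os.environ if environ is None else environ
    return normalize_encoding(environ.get("HGENCODING") or default)
```

With this sketch, `pick_encoding({"HGENCODING": "ascii"})` and
`pick_encoding({})` both yield `utf-8`, while an explicit non-ascii choice such
as `latin-1` is passed through unchanged.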

This should also make future integration with Rust easier (where utf-8 is
enforced and there is no option to change the encoding). In the future we
might drop the flexibility of choosing a custom encoding altogether, so this
diff autofixes `ascii` to `utf-8` instead of allowing `ascii` to be set. We
cannot enforce `utf-8` yet, because of Windows.

Here is our encoding strategy vs the upstream's:

| item           | upstream | upstream | ours          | ours  |
|                | current  | ideal    | current       | ideal |
| CLI argv       | bytes    | bytes    | utf-8 [1]     | utf-8 |
| path           | bytes    | auto [3] | migrating [2] | utf-8 |
| commit message | utf-8    | utf-8    | utf-8         | utf-8 |
| bookmark name  | utf-8    | utf-8    | utf-8         | utf-8 |
| file content   | bytes    | bytes    | bytes         | bytes |

[1]: utf-8 was accidentally enforced for command-line arguments by a Rust
wrapper. But it simplified a lot of things and is kind of ok: everything that
can be passed as a CLI argument is utf-8: -M commit message, -b bookmark, paths,
etc. There is no "file content" passed via CLI arguments.

[2]: Path is controversial, because it's possible for systems to have non-utf-8
paths. The upstream behavior is incorrect if a repo gets shared among systems
with different encodings (e.g. both Linux and Windows). We have to know the
encoding of paths to be able to convert them into a form suitable for the local
system. One way is to enforce UTF-8 for paths. The other is to keep encoding
information stored with individual paths (like Ruby strings). The UTF-8
approach is much simpler, at the cost that non-utf-8 paths become unsupported,
which seems to be a reasonable trade-off.
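
To illustrate the "enforce UTF-8 for paths" option in [2], here is a rough
sketch under the assumption that paths arrive as raw bytes. The function name
`path_to_utf8` is hypothetical and is not part of this diff; it only shows the
trade-off: ascii and utf-8 paths are accepted, anything else is rejected.

```python
def path_to_utf8(raw):
    """Decode a raw path as UTF-8, rejecting paths in any other encoding.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # Non-utf-8 paths are the unsupported case described in [2].
        raise ValueError("path is not valid UTF-8: %r" % (raw,)) from err
```

A plain ascii path such as `b"src/main.rs"` decodes unchanged, while a latin-1
encoded name such as `b"caf\xe9.txt"` raises `ValueError` instead of being
silently misinterpreted.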

[3]: See https://www.mercurial-scm.org/wiki/WindowsUTF8Plan.

Reviewed By: singhsrb

Differential Revision: D17098991

fbshipit-source-id: c0ff1e586a887233bd43cdb854fb3538aa9b70c2
2019-09-12 15:06:36 -07:00
contrib Changing commit hash length to 8 in hg prompt 2019-09-12 03:28:36 -07:00
distutils_rust distutils_rust: workaround a 'cc' deadlock issue 2019-09-10 10:49:52 -07:00
doc doc: remove unused doc build step 2019-05-14 15:09:42 -07:00
edenscm encoding: replace 'ascii' with 'utf-8' automatically 2019-09-12 15:06:36 -07:00
edenscmnative bindings: split the crate into multiple crates 2019-09-12 10:51:07 -07:00
exec Delete extern crate lines 2019-09-11 22:02:16 -07:00
i18n ui: add labelled prefixes to ui.write 2019-05-09 06:55:11 -07:00
lib remotefilelog: add prefetch method to remotetreestore 2019-09-12 10:24:46 -07:00
newdoc doc: update WritingNativeCommands 2019-08-28 19:26:28 -07:00
slides slides: recompile with newer tex toolchain 2019-04-18 13:50:03 -07:00
tests encoding: replace 'ascii' with 'utf-8' automatically 2019-09-12 15:06:36 -07:00
.editorconfig move scm/hg/.clang-format to scm/hg/mercurial/ 2018-05-25 14:35:51 -07:00
.flake8 codemod: join the auto-formatter party 2018-05-25 22:17:29 -07:00
.gitignore setup: move native extensions to edenscmnative 2019-06-19 17:55:49 -07:00
.hgsigs Added signature for changeset f51ae48a3fd9 2017-12-01 13:49:47 -06:00
.jshintrc hgweb: add .jshintrc with some basic rules 2017-11-22 22:18:06 +08:00
CONTRIBUTING contributing: add new file with a pointer to the wiki 2016-10-08 10:39:00 -04:00
CONTRIBUTORS Add note to CONTRIBUTORS file 2007-11-07 21:10:30 -06:00
COPYING COPYING: refresh with current address from fsf.org 2011-06-02 11:17:02 -05:00
gen_version.py generate __version__.py during the buck build 2018-06-25 15:52:25 -07:00
hgeditor spelling: trivial spell checking 2015-10-17 00:58:46 +02:00
hgweb.cgi codemod: import from the edenscm package 2019-01-29 17:25:32 -08:00
Makefile makefile: remove 'hg version' invocation in make local 2019-08-28 13:54:57 -07:00
README.rst doc: rename README to README.rst 2017-09-26 08:37:17 +02:00
setup.py bindings: split the crate into multiple crates 2019-09-12 10:51:07 -07:00

Mercurial
=========

Mercurial is a fast, easy to use, distributed revision control tool
for software developers.

Basic install::

 $ make            # see install targets
 $ make install    # do a system-wide install
 $ hg debuginstall # sanity-check setup
 $ hg              # see help

Running without installing::

 $ make local      # build for inplace usage
 $ ./hg --version  # should show the latest version

See https://mercurial-scm.org/ for detailed installation
instructions, platform-specific notes, and Mercurial user information.