As we were trying to transition off of the non production lfs-test-server for
further experimenting, one of the problems we ran into was interoperability. A
coworker setup gitbucket[1] to act as the blob server, tested with git, and
passed it off to me. But push failed with a message saying "abort: LFS server
returns invalid JSON:", and then proceeded to dump a huge HTML page to the
screen. It turns out that it is assuming that git is the only thing that wants
to do a blob transfer, and everything else is a web browser wanting HTML.
It's only a single data point, but I suspect other things may be doing this too.
RFC7231 gives an example [2] of listing multiple products in decreasing order of
significance. Since the standard provides for this, and since it works with the
one problematic server I found, I'm just enabling this by default for a better
UX.
There's nothing significant about the version of git chosen, other than it is
the current version.
[1] https://github.com/gitbucket/gitbucket/
[2] https://tools.ietf.org/html/rfc7231#page-46
Make 'hg outgoing' respect "paths.default:pushurl" in addition to
"paths.default-push".
'hg outgoing' has always meant "what will happen if I run 'hg push'?" and it's
still documented that way:
Show changesets not found in the specified destination repository or the
default push location. These are the changesets that would be pushed if a
push was requested.
If the user uses the now-deprecated "paths.default-push" path, it continues to
work that way. However, as described at
https://bz.mercurial-scm.org/show_bug.cgi?id=5365, it doesn't behave the same
with "paths.default:pushurl".
Why does it matter? Similar to the bugzilla reporter, I have a read-only mirror
of a non-Mercurial repository:
upstream -> imported mirror -> user clone
^-----------------------/
Users push directly to upstream, and that content is then imported into the
mirror. However, those repositories are not the same; it's possible that the
mirroring has either broken completely, or an import process is running and not
yet complete. In those cases, 'hg outgoing' will list changesets that have
already been pushed.
Mozilla's desired behavior described in bug 5365 can be accomplished through
other means (e.g. 'hg outgoing default'), preserving the consistency and
meaning of 'hg outgoing'.
We already have the ability to customize the ssh command line arguments, let's
add the ability to customize its environment as well.
Example use-case is ssh.exe from Git on Windows. If `HOME` enviroment variable
is present and has some non-empty value, ssh.exe will try to access that
location for some stuff (for example, it seems for resolving `~` in
`.ssh/config`). Git for Windows seems to sometimess set this variable to the
value of `/home/username` which probably works under Git Bash, but does not
work in a native `cmd.exe` or `powershell`. Whatever the root cause, setting
`HOME` to be an empty string heals things. Therefore, some distributors
might want to set `sshenv.HOME=` in the configuration (seems less intrusive
that forcing everyone to tweak their env).
Test Plan:
- rt
Differential Revision: https://phab.mercurial-scm.org/D1683
The windows workers weren't daemons and were not correctly killed when ctrl-c'd from the terminal. Withi this change when the main thread is killed, all daemons get killed as well.
I also reduced the time we give to workers to cleanup nicely to not have people ctrl-c'ing when they get inpatient.
The output when threads clened up nicely:
PS C:\<dir>> hg.exe sparse --disable-profile SparseProfiles/<profile>.sparse
interrupted!
The output when threads don't clenup in 1 sec:
PS C:\<dir> hg.exe sparse --enable-profile SparseProfiles/<profile>.sparse
failed to kill worker threads while handling an exception
interrupted!
Exception in thread Thread-4 (most likely raised during interpreter shutdown):
PS C:\<dir>>
Test Plan:
Run hg command on windows (pull/update/sparse). Ctrl-C'd sparse --enable-profile command that was using threads and observed in proces explorer that all threads got killed.
ran tests on CentOS
Differential Revision: https://phab.mercurial-scm.org/D1564
This adds config to disable/enable workers with default being enabled.
Test Plan:
enabled profile without updaing .hg/hgrc (the default should be to use workers) and ran
hg sprase --enable-profile <profile>.sparse
Watched in the proces explorer that hg started 12 new threads for materializing files (this is my worker.numcpus) value
Added
[worker]
enabled = False
to the .hg/hgrc and re ran the command. This time hg didn't spawn any new threads for matreializing of files
Differential Revision: https://phab.mercurial-scm.org/D1460
This adds handling of exceptions from worker threads and resurfaces them as if the function ran without workers.
If any of the threads throws, the main thread kills all running threads giving them 5 sec to handle the interruption and raises the first exception received.
We don't have to join threads if is_alive() is false
Test Plan:
Ran multiple updates/enable/disable sparse profile and things worked well
Ran test on CentOS- all tests passing on @ passed here
Added a forged exception into the worker code and got it properly resurfaced and the rest of workers killed: P58642088
PS C:\open\<repo>> ..\facebook-hg-rpms\build\hg\hg.exe --config extensions.fsmonitor=! sparse --enable-profile <profile>
updating [==> ] 1300/39166 1m57sException in thread Thread-3:
Traceback (most recent call last):
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\threading.py", line 801, in __bootstrap_inner
self.run()
File "C:\open\facebook-hg-rpms\build\hg\mercurial\worker.py", line 244, in run
raise e
Exception: Forged exception
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\threading.py", line 801, in __bootstrap_inner
self.run()
File "C:\open\facebook-hg-rpms\build\hg\mercurial\worker.py", line 244, in run
raise e
Exception: Forged exception
<...>
Traceback (most recent call last):
File "C:\open\facebook-hg-rpms\build\hg\hgexe.py", line 41, in <module>
dispatch.run()
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 85, in run
status = (dispatch(req) or 0) & 255
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 173, in dispatch
ret = _runcatch(req)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 324, in _runcatch
return _callcatch(ui, _runcatchfunc)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 332, in _callcatch
return scmutil.callcatch(ui, func)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\scmutil.py", line 154, in callcatch
return func()
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 314, in _runcatchfunc
return _dispatch(req)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 951, in _dispatch
cmdpats, cmdoptions)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\remotefilelog\__init__.py", line 415, in runcommand
return orig(lui, repo, *args, **kwargs)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\undo.py", line 118, in _runcommandwrapper
result = orig(lui, repo, cmd, fullargs, *args)
File "C:\open\facebook-hg-rpms\build\hg\hgext\journal.py", line 84, in runcommand
return orig(lui, repo, cmd, fullargs, *args)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\perftweaks.py", line 268, in _tracksparseprofiles
res = runcommand(lui, repo, *args)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\perftweaks.py", line 256, in _trackdirstatesizes
res = runcommand(lui, repo, *args)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\copytrace.py", line 144, in _runcommand
return orig(lui, repo, cmd, fullargs, ui, *args, **kwargs)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\fbamend\hiddenoverride.py", line 119, in runcommand
result = orig(lui, repo, cmd, fullargs, *args)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 712, in runcommand
ret = _runcommand(ui, options, cmd, d)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 959, in _runcommand
return cmdfunc()
File "C:\open\facebook-hg-rpms\build\hg\mercurial\dispatch.py", line 948, in <lambda>
d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\util.py", line 1183, in check
return func(*args, **kwargs)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\fbsparse.py", line 860, in sparse
disableprofile=disableprofile, force=force)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\fbsparse.py", line 949, in _config
len, _refresh(ui, repo, oldstatus, oldsparsematch, force))
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\hgext3rd\fbsparse.py", line 1116, in _refresh
mergemod.applyupdates(repo, typeactions, repo[None], repo['.'], False)
File "C:\open\facebook-hg-rpms\build\hg\hg-python\lib\site-packages\remotefilelog\__init__.py", line 311, in applyupdates
return orig(repo, actions, wctx, mctx, overwrite, labels=labels)
File "C:\open\facebook-hg-rpms\build\hg\mercurial\merge.py", line 1464, in applyupdates
for i, item in prog:
File "C:\open\facebook-hg-rpms\build\hg\mercurial\worker.py", line 286, in _windowsworker
raise t.exception
Exception: Forged exception
PS C:\open\ovrsource>
Differential Revision: https://phab.mercurial-scm.org/D1459
This change implements thread based worker on windows.
The handling of exception from within threads will happen in separate diff.
The worker is for now used in mercurial/merge.py and in lfs extension
After multiple tests and milions of files materiealized, thousands lfs fetched
it seems that neither merge.py nor lfs/blobstore.py is thread unsafe. I also
looked through the code and besides the backgroundfilecloser (handled in base
of this) things look good.
The performance boost of this on windows is
~50% for sparse --enable-profile
* Speedup of hg up/rebase - not exactly measured
Test Plan:
Ran 10s of hg sparse --enable-profile and --disable-profile operations on large profiles and verified that workers are running. Used sysinternals suite to see that all threads are spawned and run as they should
Run various other operations on the repo including update and rebase
Ran tests on CentOS and all tests that pass on @ pass here
Differential Revision: https://phab.mercurial-scm.org/D1458
This disables background file closing when in not in main thread
Test Plan:
Ran pull, update, sparse commands and watched the closer threads created and destroyed in procexp.exe
ran test on CentOS. No tests broken compared to the base
Differential Revision: https://phab.mercurial-scm.org/D1457
This does a few things:
* Changes "-r" to "--rev", since "-r" is not a valid short form
* Removes non-existent "-l" and "-b" options
* Removes "..." after options, since we don't usually have that
Differential Revision: https://phab.mercurial-scm.org/D1706
I may add an alternative way of getting copy metadata (from changelog,
not filelog) but the chaining with the dirstate copy metadata will be
the same, so it will probably help to have this extracted. Even if
that doesn't happen, the next patch will show that we can simplify
this a bit after this refactoring, so it seems worth it regardless.
Differential Revision: https://phab.mercurial-scm.org/D1697
The function would ignore the matcher if the dirstate copies were
requested. It doesn't matter in practice because all callers used the
returned map only for looking up specific files from and those files
had already been filtered by the matcher (AFACT). Still, it's a little
confusing, so let's make it clearer by respecting the matcher in this
case too.
Differential Revision: https://phab.mercurial-scm.org/D1695
It seems like it didn't even exist when debugdiscovery was introduced
in 43f4c1113c8d (discovery: add new set-based discovery, 2011-05-02).
Differential Revision: https://phab.mercurial-scm.org/D1693
It seems like it didn't even exist when debugdiscovery was introduced
in 43f4c1113c8d (discovery: add new set-based discovery, 2011-05-02).
Differential Revision: https://phab.mercurial-scm.org/D1692
It seems like it didn't even exist when debugdiscovery was introduced
in 43f4c1113c8d (discovery: add new set-based discovery, 2011-05-02).
Differential Revision: https://phab.mercurial-scm.org/D1691
Once upon a time, in 1995, there were browsers that didn't understand <script>
tags and they would simply show the code inside as text. This started a
tradition of wrapping everything inside <script> in <!-- HTML comments -->.
Nowadays, it's not only not needed, but can be considered harmful[1]:
- within XHTML documents, the source will actually be hidden from all browsers
and rendered useless
- `--` is not allowed within HTML comments, so any decrement operations in
script are invalid
[1]: http://www.javascripttoolbox.com/bestpractices/#comments
In large repositories, updates involving the creation of many files check the
same directories repeatedly in the wctx manifest. Move these checks out to a
separate loop to avoid repeated checks hitting the manifest.
Differential Revision: https://phab.mercurial-scm.org/D1226
As mentioned in D1222, the recent pathconflicts change regresses update
performance in large repositories when many files are being updated.
To mitigate this, we introduce two caches of directories that have
already found to be either:
- unknown directories, but which are not aliased by files and
so don't need to be checked if they are files again; and
- missing directores, which cannot cause path conflicts, and
cannot contain a file that causes a path conflict.
When checking the paths of a file, testing against this caches means we can
skip tests that involve touching the filesystem.
Differential Revision: https://phab.mercurial-scm.org/D1224
If this feature is enabled, early options are parsed using the global options
table. As the parser stops processing options when non/unknown option is
encountered, it won't mistakenly take an option value as a new early option.
Still "--" can be injected to terminate the parsing (e.g. "hg -R -- log"), I
think it's unlikely to lead to an RCE.
To minimize a risk of this change, new fancyopts.earlygetopt() path is enabled
only when +strictflags is set. Also the strict parser doesn't support '--repo',
a short for '--repository' yet. This limitation will be removed later.
As this feature is backward incompatible, I decided to add a new opt-in
mechanism to HGPLAIN. I'm not pretty sure if this is the right choice, but
I'm thinking of adding +feature/-feature syntax to HGPLAIN. Alternatively,
we could add a new environment variable. Any bikeshedding is welcome.
Note that HGPLAIN=+strictflags doesn't work correctly in chg session since
command arguments are pre-processed in C. This wouldn't be easily fixed.
The next patch will add a flag for strict parsing of early options, where
we'll have to parse all early options at once instead of processing them
one-by-one by dispatch._earlygetopt(). That's why I decided to hook
fancyopts().
All dispatch._early*opt() functions is planned to be replaced with this
function. But in this stable series, only the strict mode will be handled
by fancyopts.earlygetopt().
The `hg lfconvert --to-normal` command uses the convert extension internally to
work its magic, but that produced devel-warn messages if the convert extension
wasn't loaded by the user. The test in 658e7a6d93e0 (modified here) wasn't
showing the warnings because the convert extension was loaded via $HGRCPATH.
Most of the config options default to None/False, but 'hg.usebranchnames' and
'hg.tagsbranch' are supposed to default to True and 'default' respectively.
The first iteration of this was to ui.setconfig() inside lfconvert, to force the
convert extension to load. But there really is no precedent for doing this, and
check-config complained that 'extensions.convert' isn't documented. Yuya
suggested this alternative.
This partially backs out 448e09d8859d.
Repoview can have a different life cycle, causing issue in some corner
cases. The particular instance that revealed this comes from localpeer. The
localpeer hold a reference to the unfiltered repository, but calling 'local()'
will create an on-demand 'visible' repoview. That repoview can be garbaged
collected any time. Here is a simplified step by step reproduction::
1) tr = peer.local().transaction('foo')
2) tr.close()
After (1), the repoview object is garbage collected, so weakref used in (2)
point to nothing.
Thanks to Sean Farley for helping raising and debugging this issue.
Before, early options were stripped from args, and because of this, some
kind of parsing errors weren't reported. For example,
$ hg ci -m -Ra file
would execute "hg ci -m file" in repository "a".
This patch fixes the issue by parsing early options again by real getopt-based
parser, and verifying the results. If the early parsing appears wrong, hg just
aborts. The current error message seems not nice, and should be improved, maybe
in V2 or follow-up.
Note that this isn't a security feature because we can still do anything by
using shell aliases.
So we can easily compare it with the corresponding getopt() result.
There's a minor behavior change. Before, "hg --cwd ''" failed with ENOENT.
But with this patch, an empty cwd is silently ignored. "hg -R ''" has always
worked as such, so -R has no BC.
This allows us to parse the original args later by full-blown getopt() in
order to verify the result of the faulty early parsing. Still we need the
'strip=True' behavior for shell aliases.
Note that this series is RFC because it seems to change too much to be
included in stable release.
Perhaps we'll need to restrict the parsing rules of --debugger and --profile,
where this patch will help us know why the --debugger option doesn't work.
I have another series to extend this feature to --config/--cwd/-R, but even
with that, shell aliases can be used to get around the restriction.
This is a minimal copy of localrepo.commit(). As the current amend() function
heavily depends on the wctx API, it wasn't easy to port it to use a separate
status tuple. So for now, wctx._status is updated in-place.
When origbackuppath is set, when looking to see if a file we are backing up
conflicts with a directory in the origbackuppath, we incorrectly match on
symlinks to directories. This means we try to call vfs.rmtree on the
symlink, which fails.
Differential Revision: https://phab.mercurial-scm.org/D1311
We change subrepos.allowed from a list of allowed subrepo types to
a combination of a master switch and per-type boolean flag.
If the master switch is set, subrepos can be disabled wholesale.
If subrepos are globally enabled, then per-type options are
consulted. Mercurial repos are enabled by default. Everything else
is disabled by default.
These config items control share behavior that is implemented in core.
Since the functionality is implemented in core, extensions may
leverage it.
Mozilla has one such extension. And, it needs to access share.pool.
Before this patch, a devel warning regarding accessing an unregistered
config option would be issued unless the share extension were loaded.
Moving the registration of the config options to core fixes this.
We have a security issue with git subrepos. I'm not sure if svn subrepo is
vulnerable, but it seems not 100% safe to allow writing arbitrary data into
a metadata directory. So for now, only hg subrepo is enabled by default.
Maybe we should improve the help to describe why git/svn subrepos are
disabled.
This allows us to minimize the behavior change introduced by the next patch.
I have no idea which config style is preferred in UX POV, but I decided to
get things done.
a) list: 'allowed = hg, git, svn'
b) sub option: 'allowed.hg = True' or 'allowed:hg = True'
c) per-type action: 'hg = allow', 'git = abort'
This is an alternative workaround for the issue5730.
Perhaps this is the simplest way of disabling subrepo operations. It does
nothing clever, but just aborts if Mercurial starts accessing to a subrepo.
I think Greg's patch is more useful since it allows us to at least check
out the parent repository. However, that would be confusing if the default
is flipped to checkout=False and subrepos are silently ignored.
I don't like the config name 'allowed', but I couldn't get any better name.