Commit Graph

84 Commits

Author SHA1 Message Date
Yuya Nishihara
9618d42102 chg: remove outdated rule to start test server
This rule is no longer useful because chg daemon may be killed and respawned
per config/environment hash. We can't reliably run a daemon in foreground.
2017-10-12 22:21:14 +09:00
muxator
292c1fdcc4 build: chg build was failing when the base directory contained spaces 2017-10-11 02:06:12 +02:00
Yuya Nishihara
bac3fece2c chg: just forward --time to command server
Since we've removed the use of atexit in ee4f321cd621, --time just works.
2017-10-07 22:07:10 +09:00
Jun Wu
a05655c958 chg: show timestamp with debug messages
Like `strace -tr`, this helps finding performance bottlenecks.

Differential Revision: https://phab.mercurial-scm.org/D807
2017-09-23 14:58:40 -07:00
Mathias De Maré
cafe7e372c chg: define _GNU_SOURCE to allow CentOS 5 compilation
Without this flag, compilation fails with:
 hgclient.c: In function 'hgc_open':
 hgclient.c:466: error: 'O_DIRECTORY' undeclared (first use in this function)
 hgclient.c:466: error: (Each undeclared identifier is reported only once
 hgclient.c:466: error: for each function it appears in.)

Differential Revision: https://phab.mercurial-scm.org/D260
2017-08-07 13:40:36 +02:00
Jun Wu
488ba17f87 chg: respect environment variables for pager
Previously chg runs the pager command without respecting its environment
variables being told to use. This patch makes it so.
2017-04-12 16:50:23 -07:00
Jun Wu
c81e982932 chg: always wait for pager
Previously, when runcommand raises, chg aborts with, and does not wait for
pager. The call stack is like:

  hgc_runcommand -> handleresponse -> readchannel -> debugmsg("failed to
  read channel") -> exit(255)

That means, chg returns to the shell, then both the pager and the shell will
read from the terminal at the same time, causing problems.

This patch fixes that by using "atexit" to register the pager cleanup
function so chg will always wait for pager even if runcommand raises.
2017-04-11 18:31:40 -07:00
Jun Wu
dc918444b5 chg: forward user-defined signals
SIGUSR1 and SIGUSR2 are reserved for user-defined behaviors. They may be
redefined by an hg extension [1], but cannot be easily redefined for chg.
Since the default behavior (kill) is not that useful for chg, let's forward
them to hg, hoping it got redefined there and could be more useful.

[1] https://bitbucket.org/facebook/hg-experimental/commits/e7c883a465
2017-03-08 13:46:26 -08:00
Jun Wu
ddc5fd5cc8 chg: document why we send SIGHUP and SIGINT to process group
This makes the code more consistent - other signals are documented.
2017-03-08 13:34:25 -08:00
Jun Wu
bb3aea397e chg: verify XDG_RUNTIME_DIR
According to the specification [1], $XDG_RUNTIME_DIR should be ignored
unless:

  The directory MUST be owned by the user, and he MUST be the only one
  having read and write access to it. Its Unix access mode MUST be 0700.

This patch adds a check and ignores it if it does not meet part of the
criteria.

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
2017-02-06 17:01:06 -08:00
Jun Wu
402691b892 chg: check snprintf result strictly
This makes the program more robust when somebody changes hgclient's
maxdatasize in the future.
2017-01-11 23:39:24 +08:00
Jun Wu
bb277b3ff3 chg: change server's process title
This patch uses the newly introduced "setprocname" interface to update the
process title server-side, to make it easier to tell what a worker is actually
doing.

The new title is "chg[worker/$PID]", where PID is the process ID of the
connected client. It can be directly observed using "ps -AF" under Linux, or
"ps -A" under FreeBSD.
2017-01-11 07:40:52 +08:00
Jun Wu
b61b02a865 chg: remove getpager support
We have enough bits to switch to the new chg pager code path in runcommand.
So just remove the legacy getpager support.

This is a red-only patch, and will break chg's pager support temporarily.
2017-01-10 06:59:39 +08:00
Jun Wu
5ae59a4110 chg: handle pager request client-side
This patch implements the simple S-channel pager handling at chg
client-side.

Note: It does not deal with environ and cwd currently for simplicity, which
will be fixed later.
2017-01-10 06:59:03 +08:00
Jun Wu
923ee6957d chg: check type read from S channel
The previous patch added the check server-side. This patch added it
client-side.
2017-01-06 16:14:52 +00:00
Jun Wu
734e02b02d chg: send type information via S channel (BC)
Previously S channel is only used to send system commands. It will also be
used to send pager commands. So add a type parameter.

This breaks older chg clients. But chg and hg should always come from a
single commit and be packed into a single package. Supporting running
inconsistent versions of chg and hg seems to be unnecessarily complicated
with little benefit. So just make the change and assume people won't use
inconsistent chg with hg.
2017-01-06 16:11:03 +00:00
Jun Wu
dfb8c98044 chg: add procutil.h
This patch adds a formal header procutil.h for procutil.c, and changes
Makefile to build procutil.c independently.
2017-01-02 14:57:14 +00:00
Jun Wu
8e6abc8b48 chg: let procutil maintain its own pagerpid
Previously, chg.c maintains the pagerpid. Let's move it to procutil.c.

Note: chg.c still have a pagerpid to decide whether to call attachio or not.
In the future, attachio may be moved from hgc_open to hgc_runcommand, and
hgc_runcommand handles both pager and attachio so we don't need to run
attachio twice. And chg.c will be free of pagerpid.
2017-01-02 14:43:37 +00:00
Jun Wu
4321520c94 chg: decouple hgclient from setuppager
procutil should not depend on hgclient. This patch makes the pager handling
part independent from hgclient.
2017-01-02 14:10:32 +00:00
Jun Wu
efbbb12a58 chg: decouple hgclient from setupsignalhandler
procutil should not depend on hgclient. This patch makes the signal handling
part independent from hgclient.
2017-01-02 14:04:35 +00:00
Jun Wu
fc44f74541 chg: move signal and pager handling to a separate file
In the future hgclient will deal with pager directly inside runcommand, so
related signal handling stuff needs to be decoupled from chg.c.

The signal handling and pager logic are coupled because we need to forward
SIGPIPE when pager exits. So they are moved together, otherwise a global
variable (pagerpid) is inevitable.

This patch moves related functions from chg.c to procutil.c, which was
marked as copied to maintain annotate history.

The move is done without code modification for easy review, therefore
`#include "procutil.c"` was introduced temporarily.
2017-01-02 14:02:47 +00:00
Jun Wu
2419c6679a chg: respect XDG_RUNTIME_DIR
$XDG_RUNTIME_DIR [1] is a better place for user daemons. Let's use it and
fallback to $TMPDIR.

After this patch, chg will try socket paths in the following order:

  1. $CHGSOCKNAME
  2. $XDG_RUNTIME_DIR/chg/server
  3. ${TMPDIR:-tmp}/chg$UID/server

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
2016-12-26 00:02:42 +00:00
Jun Wu
4548a63b44 chg: make "get default sockdir" a separate method
The logic to get a default socket directory will become longer in the next
patch. So let's move it out.
2016-12-25 23:49:54 +00:00
Jun Wu
f437cb2c85 chg: handle connect failure before errno gets overridden
This patch moves the error handling logic up so that errno after connect
won't be overridden.
2016-12-25 23:32:11 +00:00
Jun Wu
d23c117db4 chg: support long socket path
This patch replaces UNIX_PATH_MAX (108) with PATH_MAX (4096) so we can have
long unix path.
2016-12-23 16:26:40 +00:00
Jun Wu
c5374c4a97 chg: remove sockdirfd
See the previous patch for the reason.
2016-12-23 16:16:44 +00:00
Jun Wu
8499ade54b chg: let hgc_open support long path
"sizeof(sun_path)" is too small. Use the chdir trick to support long socket
path, like "mercurial.util.bindunixsocket".

It's useful for cases where TMPDIR is long. Modern OS X rewrites TMPDIR to a
long value. And we probably want to use XDG_RUNTIME_DIR [2] for Linux.

The approach is a bit different from the previous plan, where we will have
hgc_openat and pass cmdserveropts.sockdirfd to it. That's because the
current change is easier: chg has to pass a full path to "hg" as the
"--address" parameter. There is no "--address-basename" or "--address-dirfd"
flags. The next patch will remove "sockdirfd".

Note: It'd be nice if we can use a native "connectat" implementation.
However, that's not available everywhere. Some platform (namely FreeBSD)
does support it, but the implementation has bugs so it cannot be used [2].

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
[2]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-April/082892.html
2016-12-23 16:37:00 +00:00
Jun Wu
e912a360fc chg: remove locks
See the previous two patches for the reason. The advantage is a simplified
code base and better throughput when starting multiple servers with multiple
confighashes. The disadvantage is starting multiple servers in parallel with
a single confighash will waste some CPU time, which is probably fine in
common use-cases.

This makes it easier to switch to relative paths to support long unix domain
socket paths.
2016-12-19 22:15:00 +00:00
Jun Wu
24cb48448f chg: start server at a unique address
See the previous patch for motivation. Previously, the server is started at
a globally shared address. This patch appends pid to the address so it
becomes unique.

Note: with Linux pid namespace, the address may be non-unique, but it does
not affect correctness of chg - chg client will receive an redirection and
that's it.
2016-12-19 22:09:49 +00:00
Yuya Nishihara
30fe4722fb chgserver: make it a core module and drop extension flags
It was an extension just because there were several dependency cycles I
needed to address.

I don't add 'chgserver' to extensions._builtin since chgserver is considered
an internal extension so nobody should enable it by their config.
2016-10-15 14:30:16 +09:00
Yuya Nishihara
6c0a0f348a chg: just take it as EOF if recv() returns 0
hgc->sockfd is a blocking stream socket. recv() should never return 0 other
than EOF.

See 4ff8628991ea for the original problem.
2016-08-05 21:21:33 +09:00
Jun Wu
997125affe chg: forward SIGINT, SIGHUP to process group
These signals are meant to send to a process group, instead of a single
process: SIGINT is usually emitted by the terminal and sent to the process
group. SIGHUP usually happens to a process group if termination of a process
causes that process group to become orphaned.

Before this patch, chg will only forward these signals to the single server
process. This patch changes it to the server process group.

This will allow us to properly kill processes started by the forked server
process, like a ssh process. The behavior difference can be observed by
setting SSH_ASKPASS to a dummy script doing "sleep 100" and then run
"chg push ssh://dest-need-password-auth". Before this patch, the first Ctrl+C
will kill the hg process while ssh-askpass and ssh will remain alive. This
patch will make sure they are killed properly.
2016-07-17 22:55:47 +01:00
Jun Wu
ecf32d3bf2 chg: handle EOF reading data block
We recently discovered a case in production that chg uses 100% CPU and is
trying to read data forever:

  recvfrom(4, "", 1814012019, 0, NULL, NULL) = 0

Using gdb, apparently readchannel() got wrong data. It was reading in an
infinite loop because rsize == 0 does not exit the loop, while the server
process had ended.

  (gdb) bt
  #0 ... in recv () at /lib64/libc.so.6
  #1 ... in readchannel (...) at /usr/include/bits/socket2.h:45
  #2 ... in readchannel (hgc=...) at hgclient.c:129
  #3 ... in handleresponse (hgc=...) at hgclient.c:255
  #4 ... in hgc_runcommand (hgc=..., args=<optimized>, argsize=<optimized>)
  #5 ... in main (argc=...486922636, argv=..., envp=...) at chg.c:661
  (gdb) frame 2
  (gdb) p *hgc
  $1 = {sockfd = 4, pid = 381152, ctx = {ch = 108 'l',
        data = 0x7fb05164f010 "st):\nTraceback (most recent call last):\n"
        "Traceback (most recent call last):\ne", maxdatasize = 1814065152,"
        " datasize = 1814064225}, capflags = 16131}

This patch addresses the infinite loop issue by detecting continuously empty
responses and abort in that case.

Note that datasize can be translated to ['l', ' ', 'l', 'a']. Concatenate
datasize and data, it forms part of "Traceback (most recent call last):".

This may indicate a server-side channeledoutput issue. If it is a race
condition, we may want to use flock to protect the channels.
2016-07-18 18:55:06 +01:00
Jun Wu
2106a129bc chg: add pgid to hgclient struct
The previous patch makes the server tell the client its pgid. This patch
stores it in hgclient_t and adds a function to get it.
2016-07-17 23:05:59 +01:00
Yuya Nishihara
5bae87a105 chg: silence warning of unused parameter 'sig' 2016-06-28 22:39:06 +09:00
Jun Wu
dfebfdbf56 chg: send SIGPIPE to server immediately when pager exits (issue5278)
If the user press 'q' to leave the 'less' pager, it is expected to end the
hg process immediately. We currently rely on SIGPIPE for this behavior. But
SIGPIPE won't arrive if we don't write anything (like doing heavy
computation, reading from network etc). If that happens, the user will feel
that the hg process just hangs.

The patch address the issue by adding a SIGCHLD signal handler and sends
SIGPIPE to the server as soon as the pager exits.

This is also an issue with hg's pager implementation.
2016-06-24 15:21:10 +01:00
Yuya Nishihara
36bce4885c chg: ignore SIGINT while waiting pager termination
Otherwise the terminal would be left with unclean state. This is what
ea9ab1ca7e38 does.
2016-06-15 23:49:56 +09:00
Yuya Nishihara
ca4a132300 chg: reset signal handlers to default before waiting pager
Our signal handlers forward signals to the server process, but it will
disappear soon after hgc_close(). So we should unregister handlers before
hgc_close(). Otherwise chg would abort due to kill(perrpid, sig) failure.

The problem is spotted by SIGWINCH while waiting pager termination.
2016-06-15 23:32:00 +09:00
Jun Wu
18674dc0d2 chg: change default connect timeout to 60 seconds
As discussed at
https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-June/085290.html

The default 10-second timeout is not enough if the machine is overloaded.
Let's increase it to 60 seconds.
2016-06-15 21:36:31 +01:00
Jun Wu
eacf673970 chg: make timeout adjustable
Before this patch, chg will give up when it cannot connect to the new server
within 10 seconds. If the host has high load during that time, 10 seconds
is not enough.

This patch makes it adjustable using the CHGTIMEOUT environment variable.
2016-06-13 21:30:14 +01:00
Jun Wu
7c2dd6af83 chg: exec pager in child process
Before this patch, chg uses the old pager behavior (pre 55f6f7fb60d2), which
executes pager in the main process. The user will see the exit code of the
pager, instead of the hg command.

Like 55f6f7fb60d2, this patch fixes the behavior by executing the pager in
the child process, and wait for it at the end of the main process.
2016-06-11 20:25:49 +01:00
Yuya Nishihara
c61ec06302 chg: initialize sockdirfd to -1 instead of AT_FDCWD
As we don't use sockdirfd yet, this is the simplest workaround to compile chg
on old Unices where AT_FDCWD does not exist. Foozy pointed out Mac OS X 10.10
is required for AT_FDCWD as well as xxxat() functions.
2016-04-24 21:35:30 +09:00
Jun Wu
837ef17e43 chg: forward SIGWINCH to worker
Before this patch, if the user uses chg and ncurses interface, resizing the
terminal window will mess up its content.

This patch fixes the issue by forwarding SIGWINCH to the worker process.
2016-04-10 01:28:52 +01:00
Jun Wu
63d1d02bc9 chg: server exited with code 0 without being connectable is an error
Before this patch, if the server started by chg has exited with code 0 without
creating a connectable unix domain socket at the specified address, chg will
exit with code 0, which is not the correct behavior. It can happen, for
example, CHGHG is set to /bin/true.

This patch addresses the issue by checking the exit code of the server and
printing a new error message if the server exited normally but cannot be
reached.
2016-04-10 22:00:34 +01:00
Jun Wu
fda8134cfd chg: use fsetcloexec instead of closing lockfd manually
Since we have the fsetcloexec utility function, use it instead of closing
lockfd manually.
2016-04-11 00:18:27 +01:00
Jun Wu
f8e2eeaa6b chg: extract the logic of setting FD_CLOEXEC to a utility function
Setting FD_CLOEXEC is useful for other fds such like lockfd and sockdirfd,
move the logic from hgc_open to util.
2016-04-11 00:17:17 +01:00
Jun Wu
b8937b0692 chg: add fchdirx as a utility function
As part of the series to support long socket paths, we need to use fchdir and
check its result in several places. Make it a utility function.
2016-04-10 03:14:32 +01:00
Jun Wu
bfa0407ff7 chg: check lockfd at freecmdserveropts
We check for sockdirfd at freecmdserveropts but not lockfd, which is a bit
strange to people new to the code. Add a comment and an assert to make it
clear that lockfd should be closed earlier.
2016-04-10 22:58:11 +01:00
Jun Wu
625c1b8ab3 chg: add sockdirfd to cmdserveropts
As part of the series to support long socket paths, we need to add the fd of
the directory to the cmdserveropts structure so we can use basenames instead
of full paths for sockname, redirectsockname, and lockfile.
2016-04-10 23:56:00 +01:00
Jun Wu
29846cf694 chg: fix spelling in the error message about error waiting for cmdserver
This is a trivial spelling and grammar fix.
2016-04-10 21:56:05 +01:00